Abstract. We discuss in some detail the general problem of computing averages
of convergent Euler products, and apply this to examples arising from singular
series for the $k$-tuple conjecture and more general problems of polynomial
representation of primes. We show that the "singular series" for the $k$-tuple
conjecture have a limiting distribution when taken over $k$-tuples with (distinct)
entries of growing size. We also give conditional arguments that would imply
that the number of twin primes (or more general polynomial prime patterns)
in suitable short intervals is asymptotically Poisson distributed.



1. Introduction 

Euler products over primes are ubiquitous in analytic number theory, going back
to Euler's proof that there are infinitely many prime numbers based on the behavior
of the zeta function $\zeta(s)$ as $s \to 1$. As defining L-functions of various types, Euler
products are particularly important, and their properties remain very mysterious.
In this paper, we consider the issue of the average or statistical behavior of another
important class of Euler products, the so-called singular series, arising in counting
problems for certain "patterns" of primes (singular series also occur in many problems
of additive number theory or diophantine geometry, but we do not consider
these here).

The first type of prime patterns are the prime $k$-tuples, which are the subject
of a famous conjecture of Hardy and Littlewood. Let $k \geq 1$ be an integer and let
$h = (h_1, \ldots, h_k)$ be a $k$-tuple of integers with $h_i \geq 1$ for all $i$. Let then

$\pi(N; h) = |\{n \leq N \mid n + h_i \text{ is prime for } 1 \leq i \leq k\}|$

be the counting function for primes represented by this $k$-tuple; note that, for
instance, $h = (1, 3)$ leads to the function counting twin primes up to $N$.
For any prime number $p$, let $\nu_p(h)$ denote the cardinality of the set

$\{h_1, \ldots, h_k\} \pmod{p}$

of the reductions of the $h_i$ modulo $p$. Note that $1 \leq \nu_p(h) \leq \min(k, p)$ for all $p$,
and that if we assume (as we now do) that the $h_i$'s are distinct, then $\nu_p(h) = k$ for
all sufficiently large $p$.
The singular series associated with $h$ is defined as the Euler product

(1.1)   $\mathfrak{S}(h) = \prod_p \Bigl(1 - \frac{\nu_p(h)}{p}\Bigr) \Bigl(1 - \frac{1}{p}\Bigr)^{-k}$

which is absolutely convergent (as will be checked again later; here and throughout
the paper, as usual, $p$ is restricted to prime numbers).
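As a numerical illustration of definition (1.1), the following minimal sketch computes $\nu_p(h)$ and a truncated version of the Euler product. The function names and the cutoff `prime_bound` are assumptions made for this illustration only; the truncation merely approximates the infinite product.

```python
def primes_up_to(n):
    """Sieve of Eratosthenes: list of all primes p <= n (for n >= 2)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            # Cross out the multiples of i starting at i*i.
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]


def nu_p(h, p):
    """Number of distinct residues of the entries of h modulo p."""
    return len({hi % p for hi in h})


def singular_series(h, prime_bound=10**5):
    """Partial product of (1.1) over primes p <= prime_bound (hypothetical cutoff)."""
    k = len(h)
    value = 1.0
    for p in primes_up_to(prime_bound):
        value *= (1 - nu_p(h, p) / p) * (1 - 1 / p) ** (-k)
    return value


if __name__ == "__main__":
    print(singular_series((1, 3)))  # twin primes: close to twice the twin prime constant
    print(singular_series((1, 2)))  # vanishes because nu_2 = 2
```

For $h = (1, 3)$ the partial products approach twice the twin prime constant, while for $h = (1, 2)$ the factor at $p = 2$ is zero, reflecting the fact that $n + 1$ and $n + 2$ cannot both be odd.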

The significance of this value is found in the Hardy-Littlewood prime $k$-tuple
conjecture (originally stated in [HL]), which states that we should have

(1.2)   $\pi(N; h) = \mathfrak{S}(h) \frac{N}{(\log N)^k} (1 + o(1))$, as $N \to +\infty$,

and in particular, if $\mathfrak{S}(h) \neq 0$, there should be infinitely many integers $n$ such
that $n + h_1, \ldots, n + h_k$ are simultaneously prime. Of course, if $k \geq 2$, this is still
completely open, but let us mention that from sieve methods, it follows that

$\pi(N; h) \leq 2^k k! \, (1 + o(1)) \, \mathfrak{S}(h) \frac{N}{(\log N)^k}$

as $N \to +\infty$ (see, e.g., [IK, Th. 6.7] or [HR, Ch. 4, Th. 5.3]), showing that the
singular series does arise naturally. Also some other previously inaccessible additive
problems with primes, related to counting arithmetic progressions (of fixed length)
of primes, are currently being attacked with striking success by B. Green and T.
Tao (see [GT]).

More generally, one considers polynomial prime patterns. First, a finite family
$f = (f_1, \ldots, f_m)$ of polynomials in $\mathbf{Z}[X]$ of degrees $\deg(f_j) \geq 1$ is said to be
primitive if the $f_j$ are distinct, and each $f_j$ is irreducible, has positive leading
coefficient, and the gcd of its coefficients is 1.

If $f$ is primitive, we say that an integer $n \geq 1$ is an $f$-prime seed if $f_1(n), \ldots,
f_m(n)$ are all (positive) primes. Then we denote by

$\pi(N; f) = |\{n \leq N \mid n \text{ is an } f\text{-prime seed}\}|$

for $N \geq 1$ the counting function for those prime seeds. Moreover, let

$\deg(f) = \prod_{j=1}^{m} \deg(f_j).$

A generalization of the $k$-tuple conjecture, due to Bateman and Horn [BH],¹
states that

(1.3)   $\pi(N; f) \sim \frac{1}{\deg(f)} \mathfrak{S}(f) \frac{N}{(\log N)^m}$, as $N \to +\infty$,

if $\mathfrak{S}(f) \neq 0$, where²

(1.4)   $\mathfrak{S}(f) = \prod_p \Bigl(1 - \frac{\nu_p(f)}{p}\Bigr) \Bigl(1 - \frac{1}{p}\Bigr)^{-m}$

with $\nu_p(f)$ being now the number of $x \in \mathbf{Z}/p\mathbf{Z}$ such that $f_j(x) = 0$ for some $j$,
$1 \leq j \leq m$.



1 The qualitative version of which is due to Schinzel [5]. 

2 Here, except in the special case where all $f_j$ are linear, the singular series $\mathfrak{S}(f)$ is not
absolutely convergent (see below for more details on this; the problem is that $\nu_p(f)$ is only equal
to $m$ on average over $p$, and not for all $p$ large enough, except if each $f_j$ is linear); the product is
thus defined as the limit of partial products over primes $p \leq y$.
The Hardy-Littlewood conjecture for a $k$-tuple $h$ is equivalent to this conjecture
for the primitive family

$f = (X + h_1, \ldots, X + h_k)$

for which $\nu_p(h)$ as defined previously does coincide with $\nu_p(f)$.

Our goal is to study various averages of singular series, for which there is
undoubted arithmetic interest. A result of Gallagher [Ga] states that

(1.5)   $\lim_{h \to +\infty} \frac{1}{h^k} \sum^*_{|h| \leq h} \mathfrak{S}(h) = 1,$

for any fixed $k$, where $|h| = \max h_i$ and $\sum^*$ restricts to $k$-tuples with
distinct components. This property was used by Gallagher himself to understand
the behavior of primes in short intervals (see also the recent work by Montgomery
and Soundararajan [MS]), and it is also important in the remarkable results of
Goldston, Pintz and Yildirim concerning small gaps between primes (see [GPY] or the
survey [Kl]).

Our first question is to ask about finer aspects of the distribution of $\mathfrak{S}(h)$. To
apply the method of moments, we first prove the following:

Theorem 1.1. Let $k \geq 1$ be fixed. For any complex number $m \in \mathbf{C}$ with $\mathrm{Re}(m) \geq 0$,
there exists a complex number $\mu_k(m)$ such that

$\lim_{h \to +\infty} \frac{1}{h^k} \sum^*_{|h| \leq h} \mathfrak{S}(h)^m = \mu_k(m).$

Moreover, for $m, k \geq 1$ both integers, we have the symmetry property

(1.6)   $\mu_k(m) = \mu_m(k);$

in addition, we have $\mu_1(m) = 1$ for all integers $m \geq 1$, and hence $\mu_k(1) = 1$ for all
$k \geq 1$.

The last statement ($\mu_k(1) = 1$) is of course Gallagher's theorem (1.5): our proof
is not intrinsically different, but maybe more enlightening. These results are in fact
quite straightforward, and only the final symmetry in $k$ and $m$ is maybe surprising.
However, its origin is not particularly mysterious: it is a "local" phenomenon, and
it can be guessed from (1.2) by a formal computation.

We will also find estimates for the size of the moments which are good enough
to imply the existence of a limiting distribution of $\mathfrak{S}(h)$ for $k$-tuples ($k$ fixed):

Theorem 1.2. Let $k \geq 1$ be fixed. There exists a probability law $\nu_k$ on $\mathbf{R}^+ =
[0, +\infty[$ such that $\mathfrak{S}(h)$, for $h$ with $|h| \leq h$ and $h \to +\infty$, becomes equidistributed
with respect to $\nu_k$, or equivalently

$\lim_{h \to +\infty} \frac{1}{h^k} \sum^*_{|h| \leq h} f(\mathfrak{S}(h)) = \int_{\mathbf{R}^+} f(t) \, d\nu_k(t)$

for any bounded continuous function $f$ on $\mathbf{R}$.

The second question we explore is the generalization to other prime patterns
of the result of Gallagher (based on (1.5)) that shows that a uniform version of
the prime $k$-tuple conjecture implies that for a fixed $\lambda > 0$, the distribution of
$\pi(x + \lambda \log x) - \pi(x)$ is close to a Poisson distribution of parameter $\lambda$ as $x \to +\infty$,
i.e., it implies that

(1.7)   $\frac{1}{N} |\{n \leq N \mid \pi(n + h) - \pi(n) = m\}| \to e^{-\lambda} \frac{\lambda^m}{m!}$, as $N \to +\infty$,

for any integer $m \geq 0$. It turns out that, indeed, under a general uniform version of
the Bateman-Horn conjecture, for any fixed primitive family $f$, the number of
$f$-prime seeds in short intervals of "fair" length (i.e., intervals around $n$ in which (1.3)
predicts that, on average, there should be a fixed number of $f$-prime seeds) always
follows a Poisson distribution. As for the symmetry property of the higher moments
for the singular series related to the $k$-tuple conjecture, this turns out to depend
primarily on local identities, but we found this rigidity of patterns to be quite surprising
at first sight. Precisely:



<L8 > *< W; A = ^ s «w( 1 + (S)) 



Theorem 1.3. Assume that the Bateman-Horn conjecture holds uniformly for all 
primitive families with non-zero singular series, in the sense that 

P eg(/r lJj (logA0 r 
holds for all primitive families $f$, all $\varepsilon > 0$, and all $N \geq 2$, where

$c(f) = \sum_j H(f_j), \qquad H(a_0 + a_1 X + \cdots + a_d X^d) = \max_i |a_i|,$

and the implied constant depends at most on the degrees of the elements of $f$ and
on $\varepsilon$.

Let $f$ be a fixed primitive family with $\mathfrak{S}(f) \neq 0$. For $N \geq 1$, let

$\delta(N, f) = \frac{\deg(f)}{\mathfrak{S}(f)} (\log N)^m.$

Then for any $\lambda > 0$ and any integer $r \geq 0$, we have

$\lim_{N \to +\infty} \frac{1}{N} |\{n \leq N \mid \pi(n + \lambda \delta(N, f); f) - \pi(n; f) = r\}| = e^{-\lambda} \frac{\lambda^r}{r!}.$

In other words, for $N$ large, the number of $f$-prime seeds in an interval around
$N$ of length $\lambda (\log N)^m$ is asymptotically distributed like a Poisson random
variable with mean given by $\mathfrak{S}(f) \deg(f)^{-1} \lambda$.



The final purpose of this paper is to emphasize the fact that Theorems 1.1 and 1.3
are special cases of the problem of computing the average of some families of values
of Euler products, and (because here the Euler products are absolutely convergent
or almost so) the outcome is consistent with the heuristic that the $p$-factors are
independent random variables, so the average of the Euler product is the product
of "local" averages. All this is a fairly common theme in analytic number theory,
but our presentation is maybe more systematic than usual. The works of
Granville-Soundararajan [GS] and Cogdell-Michel [CM] also present this point of view very
successfully for values of certain families of L-functions at the edge of the critical
strip, and Y. Lamzouri [La] has developed this type of idea in a quite general
context. Although this is not really relevant from the point of view of singular
series, we just mention that Euler products built of local averages still make sense
inside the critical strip for many families of L-functions, and are closely related to
their distribution (as one can see, e.g., from the work of Bohr and Jessen [1] for the
Riemann zeta function). On the critical line, "renormalized" Euler products still
occur in the moment conjectures for L-functions (see, e.g., [KS]), although other
factors (conjecturally linked to Random Matrices) also appear.

In the next section, we state in probabilistic terms a general result on averages
of random Euler products. Then we use it to prove Theorem 1.1 and Theorem 1.2
in Sections 3 and 4. In Section 5, we prove Theorem 1.3.

Notation. As usual, $|X|$ denotes the cardinality of a set. By $f \ll g$ for $x \in X$,
or $f = O(g)$ for $x \in X$, where $X$ is an arbitrary set on which $f$ is defined, we mean
synonymously that there exists a constant $C \geq 0$ such that $|f(x)| \leq C g(x)$ for all
$x \in X$. The "implied constant" is any admissible value of $C$. It may depend on
the set $X$, which is always specified or clear in context. On the other hand, $f \sim g$
as $x \to x_0$ means $f/g \to 1$ as $x \to x_0$.

We use standard probabilistic terminology: a probability space $(\Omega, \Sigma, P)$ is a
triple made of a set $\Omega$ with a $\sigma$-algebra $\Sigma$ and a measure $P$ on $\Sigma$ with $P(\Omega) = 1$. A
random variable is a measurable function $\Omega \to \mathbf{R}$ (or $\Omega \to \mathbf{C}$), and the expectation
$E(X)$ on $\Omega$ is the integral of $X$ with respect to $P$ when defined. The law of $X$ is
the measure $\nu$ on $\mathbf{R}$ (or $\mathbf{C}$) defined by $\nu(A) = P(X \in A)$. If $A \subset \Omega$, then $1_A$ is
the characteristic function of $A$.

For $k$-tuples $h = (h_1, \ldots, h_k)$, we recall that $|h| = \max(|h_i|)$. When different
values of $k$ can occur, we sometimes write $|h|_k$ to indicate the number of components
of $h$; in particular a sum such as

$\sum_{|h|_k \leq h}$

is a sum over $k$-tuples (of positive integers) with components $\leq h$.

2. A probabilistic statement

We assume given a probability space $(\Omega, \Sigma, P)$, and two sequences of random
variables

$X_p, \ Y_p : \Omega \to \mathbf{C}$

which are indexed by prime numbers.

We assume that $(Y_p)$ is an independent sequence; recall that this means that

$P(Y_{p_1} \in A_1, \ldots, Y_{p_k} \in A_k) = \prod_{1 \leq i \leq k} P(Y_{p_i} \in A_i)$

for all choices of finitely many distinct primes $p_1, \ldots, p_k$, and all measurable sets
$A_i \subset \mathbf{C}$, and that a consequence is that (when the expectation makes sense), we
have

$E(Y_{p_1} \cdots Y_{p_k}) = E(Y_{p_1}) \cdots E(Y_{p_k}).$

We now extend the family to all integers by denoting

$X_q = \prod_{p \mid q} X_p, \qquad Y_q = \prod_{p \mid q} Y_p,$

for any squarefree integer $q \geq 1$, and $X_q = Y_q = 0$ if $q \geq 1$ is not squarefree.
We will consider the behavior of the random Euler products

$Z_X = \prod_p (1 + X_p), \qquad Z_Y = \prod_p (1 + Y_p)$

and in particular their expectations $E(Z_X)$ and $E(Z_Y)$.
For this purpose, we assume that the products converge absolutely (almost
surely). More precisely, expand formally

$\prod_p (1 + X_p) = \sum^\flat_{q \geq 1} X_q$

where $\sum^\flat$ restricts the sum to squarefree numbers. Then we assume that

(2.1)   $\sum^\flat_{q > x} |X_q| \leq R_X(x)$

where $R_X(x)$ is an integrable non-negative random variable such that $R_X(x) \to 0$
almost surely as $x \to +\infty$. It then follows that $Z_X$ is almost surely an absolutely
convergent infinite product.

We moreover assume that the product

(2.2)   $\prod_p (1 + |E(Y_p)|)$

converges (absolutely). By independence of the $(Y_p)$, we know that

$|E(Y_q)| = \Bigl| E\Bigl(\prod_{p \mid q} Y_p\Bigr) \Bigr| = \prod_{p \mid q} |E(Y_p)|$

and so expanding again in series, we obtain that

(2.3)   $\sum^\flat_{q \geq 1} |E(Y_q)| = \sum^\flat_{q \geq 1} \prod_{p \mid q} |E(Y_p)| = \prod_p (1 + |E(Y_p)|) < +\infty.$

Our goal is to show that if $(X_p)$ is distributed "more or less" like $(Y_p)$, but
without being independent, the expectation of $Z_X$ is close to

$\prod_p (1 + E(Y_p)).$

In particular, we will typically have $(X_p)$ depend on another parameter (say $h$),
in such a way that $X_{p,h}$ converges in law to $Y_p$ (which will remain fixed) when
$h \to +\infty$, and this will lead to the relation

$\lim_{h \to +\infty} E\Bigl(\prod_p (1 + X_{p,h})\Bigr) = \prod_p (1 + E(Y_p))$

in a number of situations. We interpret this as saying that (when applicable) the
average of the Euler product $Z_X$ is obtained "as if" the factors were independent,
and taking the product of the local averages $1 + E(Y_p)$ of the "model" random
variables defining $Z_Y$.

Here is the precise (and almost tautological) "finitary" statement from which
applications will be derived.

Proposition 2.1. Let $(X_p)$, $(Y_p)$ be as above. Then for any choice of the auxiliary
parameter $x > 0$, we have

$E(Z_X) = \prod_p (1 + E(Y_p)) + O\Bigl( E(R_X(x)) + \sum^\flat_{q \leq x} |E(X_q - Y_q)| + \sum^\flat_{q > x} |E(Y_q)| \Bigr),$

where the implied constant is absolute, and in fact has modulus at most 1.
Proof. This more or less proves itself: for any $x \geq 1$, write first

$Z_X = \prod_p (1 + X_p) = \sum^\flat_{q \geq 1} X_q = \sum^\flat_{q \leq x} X_q + \sum^\flat_{q > x} X_q,$

then use (2.1) to estimate the second term, and take the expectation, which leads
to

$E(Z_X) = \sum^\flat_{q \leq x} E(X_q) + O(E(R_X(x))).$

Next, we insert $Y_q$ by writing $X_q = Y_q + (X_q - Y_q)$, getting

$E(Z_X) = \sum^\flat_{q \leq x} E(Y_q) + \sum^\flat_{q \leq x} E(X_q - Y_q) + O(E(R_X(x)))$

and then use

$\sum^\flat_{q \leq x} E(Y_q) = \sum^\flat_{q \geq 1} E(Y_q) + O\Bigl(\sum^\flat_{q > x} |E(Y_q)|\Bigr) = \prod_p (1 + E(Y_p)) + O\Bigl(\sum^\flat_{q > x} |E(Y_q)|\Bigr)$

to conclude the proof. □

Remark 2.2. Observe that by (2.3), the last term in the remainder tends to zero as
$x \to +\infty$. Moreover, if $R_X(x)$ is dominated by an integrable function as $x \to +\infty$,
the assumption that $R_X(x) \to 0$ almost surely implies that the first term also tends
to zero. Thus to conclude in practical applications, one needs to control the middle
term.

In terms of the "extra" parameter $h$ mentioned before the statement of the
proposition, we may typically hope for uniform estimates for $E(R_X(x))$, in terms
of $h$, say

$E(R_X(x)) \ll h^{\alpha} x^{-\beta}, \qquad \alpha, \beta > 0;$

if we also have a bound of the type

(2.4)   $E(X_q) = E(Y_q) + O(q^{\gamma} h^{-\delta}), \qquad \gamma, \delta > 0,$

(or if this holds on average over $q < x$, which may often be easier to prove, as is
the case for the error term in the prime number theorem, as the Bombieri-Vinogradov
theorem shows), this leads to a remainder term which is

$\ll h^{\alpha} x^{-\beta} + x^{1+\gamma} h^{-\delta} + \varepsilon(x)$

with $\varepsilon(x) \to 0$ as $x \to +\infty$, uniformly in $h$. Then we can conclude that

(2.5)   $\lim_{h \to +\infty} E(Z_X) = \prod_p (1 + E(Y_p))$

by choosing $x$ suitably as a function of $h$, provided we have

$\frac{\alpha}{\beta} < \frac{\delta}{1 + \gamma}.$

We will see this in action concretely in the next sections. Notice that if $\alpha$ can be
chosen arbitrarily small (i.e., $R_X(x)$ is bounded almost uniformly in terms of $h$),
then this condition can be met.
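The balancing of the two error terms in Remark 2.2 can be made concrete with a small sketch. Taking $x = h^{\theta}$, the term $h^{\alpha} x^{-\beta}$ decays when $\theta > \alpha/\beta$ and the term $x^{1+\gamma} h^{-\delta}$ decays when $\theta < \delta/(1+\gamma)$. The helper names below are hypothetical, and picking the midpoint of the admissible window is an arbitrary choice made for illustration.

```python
def admissible_exponent(alpha, beta, gamma, delta):
    """Return theta such that x = h**theta makes both error terms decay, or None.

    The window is alpha/beta < theta < delta/(1 + gamma); the midpoint is an
    arbitrary illustrative choice.
    """
    lower = alpha / beta
    upper = delta / (1 + gamma)
    if lower >= upper:
        return None  # the condition alpha/beta < delta/(1+gamma) fails
    return (lower + upper) / 2


def remainder_exponents(alpha, beta, gamma, delta, theta):
    """Exponents of h in h^alpha x^(-beta) and x^(1+gamma) h^(-delta) for x = h^theta."""
    return alpha - beta * theta, (1 + gamma) * theta - delta


if __name__ == "__main__":
    theta = admissible_exponent(0.1, 1.0, 1.0, 1.0)
    print(theta, remainder_exponents(0.1, 1.0, 1.0, 1.0, theta))
```

Both returned exponents are negative exactly when $\theta$ lies in the open window, which is non-empty precisely under the condition $\alpha/\beta < \delta/(1+\gamma)$ stated above.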
Remark 2.3. If we assume, instead of (2.2), that the product of $1 + E(|Y_p|)$
converges, which is stronger, it follows that $\sum |Y_p| < +\infty$ almost surely (its
expectation being finite), and hence the infinite product defining $Z_Y$ converges absolutely
almost surely. Also, since we have

$E\Bigl(\prod_{p \leq P} (1 + Y_p)\Bigr) = \prod_{p \leq P} (1 + E(Y_p))$

for all $P$, we would obtain

$E(Z_Y) = \prod_p (1 + E(Y_p)),$

provided $Z_Y$ converges dominatedly, for instance. This formula is also valid if
$Y_p \geq 0$, by the monotone convergence theorem. It provides an interpretation of the
right-hand side of (2.5).

3. Moments of singular series for the $k$-tuple conjecture

In this section, we prove Theorem 1.1, which includes in particular Gallagher's
theorem, in a way which may seem somewhat complicated but which clarifies the
result.

We first assume an integer $k \geq 1$ to be fixed. We rewrite (1.1) as

$\mathfrak{S}(h) = \prod_p \Bigl(1 + \frac{p^k - \nu_p(h) p^{k-1} - (p-1)^k}{(p-1)^k}\Bigr).$

It is therefore natural to define

$a(p, \nu) = \frac{p^k - \nu p^{k-1} - (p-1)^k}{(p-1)^k}$

for all primes $p$ and real numbers $0 \leq \nu \leq p$ (omitting the dependency on $k$).
We then define $a_m(p, \nu)$, for $m \in \mathbf{C}$ with $\mathrm{Re}(m) \geq 0$, by requiring that

$1 + a_m(p, \nu) = (1 + a(p, \nu))^m,$

with the convention $0^m = 0$ if $\mathrm{Re}(m) = 0$; the condition $\nu \leq p$ implies that
$1 + a(p, \nu) \geq 0$, so this is well-defined indeed. (If we assume $\nu < p$, we may extend
this to all $m \in \mathbf{C}$.)

We first need a technical lemma.

Lemma 3.1. For $m \in \mathbf{C}$ with $\mathrm{Re}(m) \geq 0$, write $m^+ = 0$ if $\mathrm{Re}(m) < 1$, and
$m^+ = m - 1$ otherwise. For all $p$ prime and $\nu$ with $1 \leq \nu \leq \min(p, k)$, we have

(3.1)   $a_m(p, k) \ll \frac{|m|}{p^2} \Bigl(1 + O\Bigl(\frac{1}{p}\Bigr)\Bigr)^{m^+},$

(3.2)   $a_m(p, \nu) \ll \frac{|m|}{p} \Bigl(1 + O\Bigl(\frac{1}{p}\Bigr)\Bigr)^{m^+}$, if $1 \leq \nu < k$,

where the implied constants depend only on $k$.

Proof. Notice first that, in the stated range, we have

$a(p, k) \ll p^{-2},$

$a(p, \nu) \ll p^{-1}$, if $1 \leq \nu < k$,

where the implied constants depend only on $k$, and then write

$a_m(p, \nu) = (1 + a(p, \nu))^m - 1 = m \, a(p, \nu) \int_0^1 (1 + t \, a(p, \nu))^{m-1} \, dt$

and estimate directly. □

We are now going to prove Theorem 1.1. Fix $h \geq 1$ (though $h$ will tend to
infinity at the end). We first interpret the $m$-th moment of the singular series in
probabilistic terms, then introduce the source of its limiting value in the framework
of the previous section.

Consider the finite set (again, depending on $k$)

$\Omega_1 = \{h = (h_i) \mid 1 \leq h_i \leq h, \ h_i \text{ distinct}\},$

with the normalized counting measure. Notice that

(3.3)   $|\Omega_1| = h^k (1 + O(h^{-1}))$

for $h \geq 1$, the implied constant depending only on $k$. We will denote by $E_1$ and $P_1$
the expectation and probability for this discrete space. So we have, for instance,
that

$P_1(\nu_p = \nu) = \frac{1}{|\Omega_1|} |\{h \in \Omega_1 \mid \nu_p(h) = \nu\}|.$

Our goal is to find the limit as $h \to +\infty$ of the average

$\frac{1}{|\Omega_1|} \sum_{|h| \leq h, \ h_i \text{ distinct}} \mathfrak{S}(h)^m$

(notice that, by (3.3), if the limit exists, it is also the limit of

$\frac{1}{h^k} \sum_{|h| \leq h, \ h_i \text{ distinct}} \mathfrak{S}(h)^m$

as $h \to +\infty$).

We write $X_p(h) = a(p, \nu_p(h))$ and $X_p(m, h) = a_m(p, \nu_p(h))$, so that

$\prod_p (1 + X_p(m, h)) = \mathfrak{S}(h)^m$

by construction.

Now consider a second space

$\Omega_2 = \prod_p (\mathbf{Z}/p\mathbf{Z})^k$

with the product measure of the probability counting measures on each factor. We
denote by $\omega = (h_p)_p$ the elements of $\Omega_2$. To avoid confusion with $\nu_p$ defined for
$h \in \Omega_1$, we introduce the random variables

$\rho_p : \Omega_2 \to \{1, \ldots, k\}, \qquad \omega = (h_p)_p \mapsto \text{number of distinct } h_{p,i} \text{ in } \mathbf{Z}/p\mathbf{Z},$

which satisfy $1 \leq \rho_p \leq \min(k, p)$.
We can now define "random" singular series using $\Omega_2$, writing $Y_p = a(p, \rho_p)$ and
considering the Euler product

$\prod_p (1 + Y_p),$

and similarly with $Y_p(m) = a_m(p, \rho_p)$ and

$\prod_p (1 + Y_p(m)) = \Bigl(\prod_p (1 + Y_p)\Bigr)^m.$

We denote by $P_2$ and $E_2$ the probability and expectation for this space. By
construction of $\Omega_2$, the random variables $(\rho_p)$ are independent, and so are the $(Y_p)$, and
the $(Y_p(m))$ for a given $m$. Note also that the components $h_p$ are equidistributed:
for any prime $p$ and any $a \in (\mathbf{Z}/p\mathbf{Z})^k$, we have

(3.4)   $P_2(h_p = a) = \frac{1}{p^k}.$

We now use Proposition 2.1 to compare the average $E_1(\mathfrak{S}(h)^m)$ with

$\prod_p E_2((1 + Y_p)^m).$

Although this proposition is phrased with a single probability space $\Omega$ on which
both Euler vectors are defined, this is not a serious issue and the statement remains
valid, provided the expectations are suitably subscripted and one writes

$|E_1(X_q(m)) - E_2(Y_q(m))|$

on the right-hand side instead of $|E(X_q(m) - Y_q(m))|$.³

We start by estimating the tail $R(x) = R_{X(m)}(x)$ of the Euler product defining
$\mathfrak{S}(h)^m$. In keeping with probabilistic conventions, we omit the argument $h \in \Omega_1$
in many places. Denoting

$\Delta(h) = \prod_{i < j} |h_i - h_j| \geq 1,$

and noting that $\nu_p = k$ unless $p \mid \Delta$, we have from Lemma 3.1 the bound

$|X_p(m)| \ll |m| \Bigl(1 + \frac{C}{p}\Bigr)^{m^+} (p, \Delta) \, p^{-2}$

for some $C > 0$ (depending only on $k$) and all $h$, $m$ (with $\mathrm{Re}(m) \geq 0$) and $p$, the
implied constant depending only on $k$ (this justifies, in particular, the convergence
of the Euler product $Z_X$ for every $h$). Hence, taking the product over $p \mid q$ for a
squarefree integer $q$, we get

$|X_q(m)| \leq (|m| B)^{\omega(q)} (q, \Delta) \, q^{-2} \prod_{p \mid q} \Bigl(1 + \frac{C}{p}\Bigr)^{m^+}$
³ We could also simply consider $\Omega = \Omega_1 \times \Omega_2$ with the product measure, or equivalently (and
maybe more elegantly) assume that we start with some space $\Omega$ and two vectors $(X_p)$, $(Y_p)$,
distributed according to the prescription of $\Omega_1$ and $\Omega_2$ respectively, i.e., with

$P(X_p = a) = \frac{1}{|\Omega_1|} |\{h \in \Omega_1 \mid a(p, \nu_p(h)) = a\}|,$
$P(Y_p = a) = \frac{1}{p^k} |\{h \in (\mathbf{Z}/p\mathbf{Z})^k \mid a(p, \rho_p(h)) = a\}|.$
for some constants $B > 0$ and $C > 0$ depending only on $k$. Since $\Delta$ is bounded by

(3.5)   $|\Delta| \leq (2h)^{k^2},$

a standard computation with sums of multiplicative functions leads to

$\sum^\flat_{q > x} |X_q(m)| \ll x^{-1} (\log 2hx)^D$

for $x \geq 2$ and some constant $D \geq 0$, depending on $k$ and $m$.

The next step is to justify the analogue of the convergence of (2.2); more precisely,
we have

(3.6)   $\prod_p (1 + E_2(|Y_p(m)|)) < +\infty.$

Indeed, Lemma 3.1 leads to

$E_2(|Y_p(m)|) \ll p^{-2} + p^{-1} P_2(\rho_p < k) \ll p^{-2}$

for $p \geq 2$, where the implied constant depends on $k$ and $m$, since it is clear that we
have

(3.7)   $P_2(\rho_p < k) \leq \frac{k(k-1)}{2p}$

for all primes $p$ and $k \geq 1$ (write that the event $\{\rho_p < k\}$ is the union, not
necessarily disjoint, of the $k(k-1)/2$ events $h_i = h_j$ with $i \neq j$, each of which has
probability $1/p$ by uniform distribution (3.4)). By independence, we then also get

(3.8)   $E_2(|Y_q(m)|) \leq A^{\omega(q)} q^{-2}$

for all squarefree integers $q$ and some constant $A \geq 1$, which depends only on $k$ and
$m$.

Finally, it remains to estimate $E_1(X_q(m)) - E_2(Y_q(m))$. We claim that, for any
$a \in \mathbf{C}$, we have

(3.9) Pi(X,(m) = a) = (l + o(|))p a (y,( m ) = a) + o(^) 

where the implied constants depend only on $k$. Assuming this, and noting that
$X_q(m)$ and $Y_q(m)$ take the same finitely many values (at most $k^{\omega(q)}$ distinct values,
which are

$\ll \frac{F^{\omega(q)}}{q}$

where the implied constant and $F$ depend on $m$ and $k$), it follows that



E x {X q {m)) = [l + 0^))E 2 (Y q (m)) + 
where G depends on to and fc, leading in turn to 

£i(X ? (m)) - E 2 (Y q (m)) « ^E 2 (\Y q (m)\) + — « — 
(see (3.8)), where the implied constant depends only on $k$ and $m$, as does $E$.
Summing over $q < x$, it then follows from Proposition 2.1 that
^ &(h) m = E 1 ([] (1 + X p (m))) = J] (1 + S 2 (y p (m)))+ 

fe h P P 

O^^log^x) 5 + ^(log^/ix) 15 

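for some $B$ depending on $k$ and $m$. Choosing for instance $x = h^{1/2}$ leads to the
existence of the $m$-th moment of singular series, with limiting value given by

(3.10)   $\mu_k(m) = \prod_p (1 + E_2(Y_p(m))) = \prod_p \Bigl(1 - \frac{1}{p}\Bigr)^{-km} \Bigl\{ \frac{1}{p^k} \sum_{h \in (\mathbf{Z}/p\mathbf{Z})^k} \Bigl(1 - \frac{\rho_p(h)}{p}\Bigr)^m \Bigr\}.$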

It only remains to prove (3.9). Note that this is clearly an expression of
quantitative equidistribution (or convergence in law) of $X_q$ to $Y_q$ as $h \to +\infty$.⁴

The proof is quite simple. First of all, given arbitrary integers $s_p$ with $p \mid q$, we
have

$P_1(\nu_p(h) = s_p \text{ for } p \mid q) = \frac{1}{|\Omega_1|} \sum^*_{|h| \leq h, \ \nu_p(h) = s_p \text{ for } p \mid q} 1 = \frac{1}{|\Omega_1|} \sum \cdots \sum_{h_p \in (\mathbf{Z}/p\mathbf{Z})^k, \ \rho_p(h_p) = s_p} \ \sum^*_{|h| \leq h, \ h \equiv h_p \ (\mathrm{mod}\ p \mid q)} 1$

(where there are as many outer sums in the last line as there are primes dividing
$q$, and the last sum involves summation conditions for all $p \mid q$). This inner sum is

(3.11)   $\sum^*_{|h| \leq h, \ h \equiv h_p \ (\mathrm{mod}\ p \mid q)} 1 = \sum_{|h| \leq h, \ h \equiv h_p \ (\mathrm{mod}\ p \mid q)} 1 + O(h^{k-1})$

where the implied constant depends on $k$ (i.e., we now forget the condition on $h$ to
have distinct components). Lattice-point counting leads to

|/i|<h 
h=h p (mod p\ q) 

where the implied constant depends again only on $k$. In view of the equidistribution
of $h_p$ for $(h_p)_p \in \Omega_2$, we therefore derive from the above the following quantitative
equidistribution result:

(3.12) P x {u p {h) = Sp for j> | q) = P 2 (p P (hp) = s p fovp\ q) (l + o(|)) + C>(~) • 

Now to derive (3.9), we need only observe that $Y_q(m)$ and $X_q(m)$ are "identical"
functions of $\rho_p$ and $\nu_p$ respectively (for $p \mid q$). Hence (3.12) implies (3.9) by summing
over all possible values of $(s_p)_{p \mid q}$ leading to a given $a$, using the fact that there are
at most $k^{\omega(q)}$ such values (the latter being a very rough estimate!).

It remains to prove the symmetry property (1.6) to finish the proof of
Theorem 1.1. We note in advance that since $\mathfrak{S}(h) = 1$ for all 1-tuples $h$, we have
$\mu_1(m) = 1$ for all $m \geq 1$, and hence $\mu_k(1) = 1$ for all $k \geq 1$, which is Gallagher's
result (1.5).

⁴ It can also be interpreted as a form of "sieve axiom".

The symmetry turns out to be true "locally", i.e., the $p$-factors of the Euler
products (3.10) defining $\mu_k(m)$ and $\mu_m(k)$ coincide for all $p$ and integers $k, m \geq 1$.

There are different ways to see this, and the following seems to encapsulate the
origin of the phenomenon. Given a finite set $F$ (which will be $\mathbf{Z}/p\mathbf{Z}$), consider the
following obviously symmetric expression of $m$ and $k$:

$\frac{1}{|F|^{m+k}} \sum_{x \in F^m} \ \sum_{h \in F^k, \ \{x_i\} \cap \{h_j\} = \emptyset} 1$

(which is the probability, for the normalized counting measure on $F^{k+m}$, that a
pair of a $k$-tuple and an $m$-tuple, both of elements of $F$, do not contain a common
element). Then it can be interpreted either as

$\frac{1}{|F|^m} \sum_{\tau=1}^{|F|} \ \sum_{x \in F^m, \ \rho(x) = \tau} \frac{1}{|F|^k} \sum_{h \in F^k, \ \{h_j\} \cap \{x_i\} = \emptyset} 1 = \frac{1}{|F|^m} \sum_{\tau=1}^{|F|} \ \sum_{x \in F^m, \ \rho(x) = \tau} \Bigl(1 - \frac{\tau}{|F|}\Bigr)^k = \frac{1}{|F|^m} \sum_{x \in F^m} \Bigl(1 - \frac{\rho(x)}{|F|}\Bigr)^k$

or (by the same computation with $m$ and $k$ reversed) as

$\frac{1}{|F|^k} \sum_{h \in F^k} \Bigl(1 - \frac{\rho(h)}{|F|}\Bigr)^m$

(using $\rho(\cdot)$ to denote the number of distinct elements in $F$ of an $m$-tuple, then of a
$k$-tuple).

Applied with $F = \mathbf{Z}/p\mathbf{Z}$, up to the symmetric factor $(1 - 1/p)^{-km}$ in (3.10), the
first is the $p$-factor for $\mu_m(k)$, and the second is the $p$-factor for $\mu_k(m)$, showing
that they are indeed equal.
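The local symmetry can be checked exhaustively for small cases. The following sketch, with hypothetical function names, uses exact rational arithmetic to compare the two local averages with the probability that a random $k$-tuple and a random $m$-tuple of elements of $\mathbf{Z}/p\mathbf{Z}$ have no common element. It is a brute-force verification for illustration, not part of the proof.

```python
from fractions import Fraction
from itertools import product


def local_average(p, k, m):
    """Exact average over h in (Z/pZ)^k of (1 - rho(h)/p)^m."""
    total = Fraction(0)
    for h in product(range(p), repeat=k):
        rho = len(set(h))  # number of distinct entries of h
        total += Fraction(p - rho, p) ** m
    return total / p ** k


def disjoint_probability(p, k, m):
    """Probability that a k-tuple and an m-tuple in Z/pZ share no element."""
    count = 0
    for h in product(range(p), repeat=k):
        entries = set(h)
        for x in product(range(p), repeat=m):
            if entries.isdisjoint(x):
                count += 1
    return Fraction(count, p ** (k + m))


if __name__ == "__main__":
    for p in (2, 3, 5):
        for k in (1, 2, 3):
            for m in (1, 2, 3):
                lhs = local_average(p, k, m)
                rhs = local_average(p, m, k)
                assert lhs == rhs == disjoint_probability(p, k, m)
    print("local symmetry verified for p in {2, 3, 5} and 1 <= k, m <= 3")
```

Multiplying either local average by $(1 - 1/p)^{-km}$ gives the corresponding $p$-factor in (3.10), so the equality of the two averages is exactly the local form of (1.6).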

Remark 3.2. Quantitatively, we have proved that 

\h\^h 

for any $\varepsilon > 0$, where the implied constant depends on $k$ and $m$. For $m = 1$,
Montgomery and Soundararajan [MS, (17), p. 593] have obtained a more refined
expansion with contributions of size $h^{k-1} \log h$ and $h^{k-1}$, and error term of size

Remark 3.3. The fact that $\mu_k(1) = 1$ can be used to recover the combinatorial
identities used by Gallagher [Ga, p. 7-8] instead of the probabilistic phrasing
above. We review this for completeness: in order to prove $\mu_k(1) = 1$, it suffices to
show that the average of $a(p, \rho_p)$ is zero. We have

$\sum_{h \in (\mathbf{Z}/p\mathbf{Z})^k} a(p, \rho_p(h)) = \sum_{\nu=1}^{k} a(p, \nu) \, |\{h \in (\mathbf{Z}/p\mathbf{Z})^k \mid \rho_p(h) = \nu\}|$

and on the other hand, we have

$|\{h \in (\mathbf{Z}/p\mathbf{Z})^k \mid \rho_p(h) = \nu\}| = \binom{p}{\nu} \left\{ {k \atop \nu} \right\}$

where $\left\{ {k \atop \nu} \right\}$ is the number of surjective maps from a set with $k$ elements to one
with $\nu$ elements⁵; indeed, a $k$-tuple $h$ with $\nu$ distinct values is the same as a map
$\{1, \ldots, k\} \to \mathbf{Z}/p\mathbf{Z}$ with image of cardinality $\nu$, i.e., the set of such tuples is the
disjoint union of those sets of surjective maps

$\{1, \ldots, k\} \to I$

over $I \subset \mathbf{Z}/p\mathbf{Z}$ with order $\nu$.
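The counting argument of Remark 3.3 can also be checked by brute force. In the sketch below (function names are assumptions for this illustration), the number of surjections is computed by the standard inclusion-exclusion formula, the count of $k$-tuples with exactly $\nu$ distinct values is compared with $\binom{p}{\nu}$ times the number of surjections, and the average of $a(p, \rho_p)$ is verified to vanish exactly.

```python
from fractions import Fraction
from itertools import product
from math import comb


def surjections(k, v):
    """Number of surjective maps from a k-element set onto a v-element set
    (inclusion-exclusion)."""
    return sum((-1) ** j * comb(v, j) * (v - j) ** k for j in range(v + 1))


def count_with_distinct(p, k, v):
    """Brute-force count of h in (Z/pZ)^k with exactly v distinct entries."""
    return sum(1 for h in product(range(p), repeat=k) if len(set(h)) == v)


def a(p, k, v):
    """The quantity a(p, v) of Section 3, as an exact fraction."""
    return Fraction(p ** k - v * p ** (k - 1) - (p - 1) ** k, (p - 1) ** k)


def average_a(p, k):
    """Average of a(p, rho_p) over (Z/pZ)^k, via the counting formula."""
    total = sum(a(p, k, v) * comb(p, v) * surjections(k, v) for v in range(1, k + 1))
    return total / p ** k


if __name__ == "__main__":
    for p in (2, 3, 5):
        for k in (1, 2, 3, 4):
            for v in range(1, k + 1):
                assert count_with_distinct(p, k, v) == comb(p, v) * surjections(k, v)
            assert average_a(p, k) == 0
    print("counting formula and vanishing average verified")
```

Note that `math.comb(p, v)` returns 0 when $\nu > p$, which matches the fact that no tuple can have more than $p$ distinct residues.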

Therefore, Gallagher's result follows from the identity 

x><->e„){*H 

which is proved in [Ga, p. 7], and which we have therefore reproved. Similarly, the 
identities 

of [Ga, p. 8] can be derived from the proof that the $p$-factor for $\mu_k(1)$ is 1.

Remark 3.4. From (1.2), one can guess that $\mu_k(m) = \mu_m(k)$ for $m \geq 1$ integer, by
computing

$\sum_{|h| \leq h} \Bigl( \sum_{n \leq N} \prod_{1 \leq i \leq k} \Lambda(n + h_i) \Bigr)^m = \sum_{|h|_k \leq h} \ \sum_{|n|_m \leq N} \ \prod_{1 \leq i \leq k} \prod_{1 \leq j \leq m} \Lambda(n_j + h_i)$

(where $n$ is an $m$-tuple), which is a symmetric expression in $n$ and $h$, except for
the ranges of summation, and which should be asymptotic to either $\mu_k(m) h^k N^m$
or $\mu_m(k) h^k N^m$ by a uniform $k$-tuple conjecture. In fact, the computation we did
amounts to doing the same argument locally (i.e., looking on average over $h$ at the
distribution of integers $n$ such that, for a fixed prime $p$, $n + h_1, \ldots, n + h_k$ are not
divisible by $p$).

This symmetry $\mu_k(m) = \mu_m(k)$, despite the simplicity of its proof, is a very
strong property, as pointed out to us by A. Nikeghbali. Indeed, write $X_k = Z_{Y,k}$,
the random variable given by the random singular series. Since we have

$\mu_k(m) = \int t^m \, d\nu_k(t) = E(X_k^m),$

the symmetry implies that the sequence $(E(X_k^m))_k$, for a fixed value of $m$, is the
sequence of moments of a probability distribution on $[0, +\infty[$, which is a highly
non-trivial property. We refer to the survey [Si] of the classical theory surrounding
the "moment problems", noting that from Theorem 1 of loc. cit. it follows that,
for any fixed $m \geq 1$, we have

$\sum_i \sum_j a_i \bar{a}_j \, \mu_{i+j}(m) > 0, \qquad \sum_i \sum_j a_i \bar{a}_j \, \mu_{i+j+1}(m) > 0,$

for any $N \geq 1$ and any complex numbers $(a_i) \in \mathbf{C}^N - \{0\}$.

⁵ This is denoted $\sigma(k, \nu)$ in [Ga], and it is not the standard notation, which would write $\nu! \left\{ {k \atop \nu} \right\}$
instead.

It would be quite interesting to know what other types of natural sequences
of random variables (or probability distributions) satisfy the relation $E(X_n^m) =
E(X_m^n)$. One fairly general construction is as follows (this was pointed out by A.
Nikeghbali and P. Bourgade): just take $X_n = Z^n$ for $Z$ a random variable such that
all moments of $Z$ exist, or a bit more generally, take a sequence $(X_n)$ of positive
random variables such that the $X_n^{1/n}$ are identically distributed. But note that the
variables we encountered are not of this type.

Example 3.5. Let $m = 2$. We find (using the symmetry property) that the mean-square
of $\mathfrak{S}(h)$ is given by

\h\^h 

where 

In particular, we find (using Pari/GP for instance):

$\mu_2(2) = 2.300\ldots$   $\mu_3(2) = 6.03294\ldots$
$\mu_4(2) = 17.562\ldots$   $\mu_5(2) = 55.255\ldots$
$\mu_6(2) = 184.18\ldots$

Note that the second (and higher) moments increase quickly with $k$ (as proved
in Proposition 4.1 in the next section). This is explained intuitively by the fact
that $\mathfrak{S}(h)$ is often zero: for instance, the 2-factor of $\mathfrak{S}(h)$ is zero unless all $h_i$ are
of the same parity, which happens with probability $2^{1-k}$ only (see Example 4.3 for
a more precise estimate). For those, of course, the 2-factor is very large (equal to
$2^{k-1}$).
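The numerical values of $\mu_k(2)$ can be approximated with a short sketch. It relies on an explicit local formula derived here, as an assumption of this illustration, from (3.10) and the symmetry $\mu_k(2) = \mu_2(k)$: a pair in $(\mathbf{Z}/p\mathbf{Z})^2$ has one distinct entry with probability $1/p$ and two with probability $1 - 1/p$, so the $p$-factor is $(1 - 1/p)^{-2k} \bigl( \tfrac{1}{p}(1 - \tfrac{1}{p})^k + (1 - \tfrac{1}{p})(1 - \tfrac{2}{p})^k \bigr)$. The function names and the cutoff `prime_bound` are hypothetical.

```python
def primes_up_to(n):
    """Sieve of Eratosthenes: list of all primes p <= n (for n >= 2)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]


def mu_k_2(k, prime_bound=10**4):
    """Partial Euler product for mu_k(2) = mu_2(k), truncated at prime_bound."""
    value = 1.0
    for p in primes_up_to(prime_bound):
        # Average of (1 - rho/p)^k over pairs in (Z/pZ)^2.
        local_mean = (1 / p) * (1 - 1 / p) ** k + (1 - 1 / p) * (1 - 2 / p) ** k
        value *= (1 - 1 / p) ** (-2 * k) * local_mean
    return value


if __name__ == "__main__":
    for k in range(1, 7):
        print(k, mu_k_2(k))
```

For $k = 2$ the $p$-factor simplifies to $1 + 1/(p-1)^3$, and in every case the factor at $p = 2$ equals $2^{k-1}$, in line with the remark above about the size of the 2-factor.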

4. Growth and distribution of moments of singular series

In this section, we will prove Theorem 1.2 using the method of moments. For
this, we consider the problem (which has independent interest) of determining the
growth of $\mu_k(m)$. We look at the dependency on $m$ for fixed $k$, or equivalently the
dependency on $k$ for fixed $m$, by symmetry (as in Example 3.5). The result is that
the moments grow just a bit faster than exponentially.

Proposition 4.1. For any fixed $k \geq 1$, we have

$\log \mu_k(m) = km \log\log 3m + O(m)$, for $m \geq 1$,

where the implied constant depends on $k$.

Proof. We use the formula (3.10), written in the form

$\mu_k(m) = \prod_p \Bigl(1 - \frac{1}{p}\Bigr)^{-km} E_2\Bigl(\Bigl(1 - \frac{\rho_p}{p}\Bigr)^m\Bigr).$

We will prove first that

$\log \mu_k(m) \geq km \log\log 3m + O(m),$

for $m \geq 1$, with an implied constant depending on $k$, before proving the
corresponding upper bound.





EMMANUEL KOWALSKI 



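We start by checking that all terms in the Euler product are $\ge 1$, i.e., for all primes $p$, all integers $k$ and all real numbers $m \ge 1$, we have

\[ \Bigl(1 - \frac{1}{p}\Bigr)^{-km} E_2\Bigl(\Bigl(1 - \frac{\rho_p}{p}\Bigr)^m\Bigr) \ge 1. \]

Indeed, by the symmetry between the $p$-factor for $\mu_k(1)$ and for $\mu_1(k)$, we have

\[ E_2\Bigl(1 - \frac{\rho_p}{p}\Bigr) = \Bigl(1 - \frac{1}{p}\Bigr)^k, \]

while raising to the $m$-th power and applying Hölder's inequality gives

\[ \Bigl(E_2\Bigl(1 - \frac{\rho_p}{p}\Bigr)\Bigr)^m \le E_2\Bigl(\Bigl(1 - \frac{\rho_p}{p}\Bigr)^m\Bigr). \]

From this we can bound $\mu_k(m)$ from below by any subproduct, and we look at

\[ \mu_k'(m) = \prod_{p \le m} \Bigl(1 - \frac{1}{p}\Bigr)^{-km} E_2\Bigl(\Bigl(1 - \frac{\rho_p}{p}\Bigr)^m\Bigr). \]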

The probability that $\rho_p$ is 1 is clearly equal to $p^{1-k}$ (there are only $p$ $k$-tuples with this property). Hence we have crude lower bounds

p / ) ^ p fe_1 ( p) 
and 

\[ \mu_k(m) \ge \mu_k'(m) \ge \prod_{p \le m} \Bigl(1 + \frac{1}{p-1}\Bigr)^{k(m-1)} \frac{1}{p^{k-1}}. \]

The logarithm of this expression is easily bounded from below as follows:

\[ \log \mu_k(m) \ge k(m-1) \sum_{p \le m} \log\Bigl(1 + \frac{1}{p-1}\Bigr) - (k-1) \sum_{p \le m} \log p = km \log\log 3m + O(m), \]

for $m \ge 2$, the implied constant depending only on $k$, by standard estimates, and we can incorporate trivially $m = 1$ also.
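The "standard estimates" invoked here are, we assume, Mertens' theorem, which gives $\sum_{p \le x} \log(1 + 1/(p-1)) = -\sum_{p \le x}\log(1 - 1/p) = \log\log x + \gamma + o(1)$, together with the Chebyshev bound $\sum_{p \le x} \log p \ll x$. The following sketch checks the first of these numerically; the function names and the cut-off are illustrative choices only.

```python
from math import log

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def primes_up_to(n):
    """Primes <= n by a basic sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]


def mertens_sum(x):
    """Sum over primes p <= x of log(1 + 1/(p - 1)) = -log(1 - 1/p)."""
    return sum(log(1 + 1 / (p - 1)) for p in primes_up_to(x))


def chebyshev_theta(x):
    """Sum over primes p <= x of log p."""
    return sum(log(p) for p in primes_up_to(x))
```

Comparing `mertens_sum(x)` with `log(log(x)) + EULER_GAMMA` for moderately large `x` shows the $\log\log$ growth that produces the main term $km \log\log 3m$.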

To prove the corresponding upper bound, we split the Euler product (3.10) into two ranges: we write

\[ \mu_k(m) = \mu_k^{(1)}(m)\, \mu_k^{(2)}(m), \]

where $\mu_k^{(1)}(m)$ is the product over primes $p < km$ (which includes the range used for the lower bound), while $\mu_k^{(2)}(m)$ is the product over the other primes $p \ge km$. We will show that

\[ \log \mu_k^{(1)}(m) \le km \log\log 3m + O(m), \qquad \log \mu_k^{(2)}(m) \ll \frac{m}{\log 2m}, \]

with implied constants depending on $k$, and this will conclude the proof.

We start with small primes, and simply bound the expectation of $(1 - \rho_p/p)^m$ by the trivial bound 1; this leads to

\[ \log \mu_k^{(1)}(m) \le -km \sum_{p < km} \log\Bigl(1 - \frac{1}{p}\Bigr) = km \log\log 3m + O(m), \]

where the implied constant depends on $k$, again by standard estimates.

Next, we estimate $\mu_k^{(2)}(m)$ more carefully. The logarithm (say $\mathcal{L}(x)$) of the product restricted to $km \le p \le x$ is given by

\[ \mathcal{L}(x) = -km \sum_{km \le p \le x} \log(1 - p^{-1}) + \sum_{km \le p \le x} \log E_2\Bigl(\Bigl(1 - \frac{\rho_p}{p}\Bigr)^m\Bigr). \]

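Using (3.7), we write first, for $p \ge km$, the upper bound

\begin{align*}
E_2\Bigl(\Bigl(1 - \frac{\rho_p}{p}\Bigr)^m\Bigr)
&\le \Bigl(1 - \frac{k}{p}\Bigr)^m (1 - P_2(\rho_p < k)) + P_2(\rho_p < k) \\
&= \Bigl(1 - \frac{k}{p}\Bigr)^m + P_2(\rho_p < k)\Bigl(1 - \Bigl(1 - \frac{k}{p}\Bigr)^m\Bigr) \\
&\le \Bigl(1 - \frac{k}{p}\Bigr)^m + \frac{mk^2(k-1)}{2p^2} \\
&\le 1 - \frac{mk}{p} + \frac{m(m-1)k^2}{2p^2} + \frac{mk^2(k-1)}{2p^2} \\
&= 1 - \frac{mk}{p} + \frac{m^2k^2}{2p^2} + \frac{mA_k}{2p^2}
\end{align*}

(with $A_k = k^3 - 2k^2$) since

\[ (1 - x)^m \le 1 - mx + \frac{m(m-1)}{2} x^2 \qquad \text{for } 0 \le x \le 1,\ m \ge 1. \]

Moreover, we have $\log(1 - x) \le -x - x^2/2$ for $0 \le x < 1$, and hence after some rearranging, we obtain

\begin{align*}
\log E_2\Bigl(\Bigl(1 - \frac{\rho_p}{p}\Bigr)^m\Bigr)
&\le -\frac{mk}{p} + \frac{m^2k^2}{2p^2} + \frac{mA_k}{2p^2} - \frac{1}{2}\Bigl(\frac{mk}{p} - \frac{m^2k^2}{2p^2} - \frac{mA_k}{2p^2}\Bigr)^2 \\
&= -\frac{mk}{p} + \frac{m^3k^3}{2p^3} - \frac{m^4k^4}{8p^4} + \frac{mA_k}{2p^2} + \frac{m^2kA_k}{2p^3} - \frac{m^2A_k^2}{8p^4} - \frac{2m^3k^2A_k}{8p^4},
\end{align*}

the terms involving $(m^2k^2)/(2p^2)$ having cancelled out.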

Summing over $km \le p \le x$, we can let $x$ go to infinity in all but the first resulting term since they define convergent series; bounding the tail by

\[ \sum_{p > km} \frac{1}{p^{\sigma}} \ll (km)^{1-\sigma} (\log 2km)^{-1} \]

leads to

\[ \sum_{km \le p \le x} \log E_2\Bigl(\Bigl(1 - \frac{\rho_p}{p}\Bigr)^m\Bigr) \le -km \sum_{km \le p \le x} \frac{1}{p} + O\Bigl(\frac{m}{\log 2m}\Bigr) \]

for all $m$ and $x \ge km$, where the implied constant depends on $k$. Finally,

\[ \mathcal{L}(x) \le -km \sum_{km \le p \le x} \Bigl(\frac{1}{p} + \log\Bigl(1 - \frac{1}{p}\Bigr)\Bigr) + O\Bigl(\frac{m}{\log 2m}\Bigr), \]

and since $p^{-1} + \log(1 - p^{-1})$ defines an absolutely convergent series with tail (for $p > y$) decreasing like $y^{-1}(\log y)^{-1}$, we obtain the desired bound for

\[ \log \mu_k^{(2)}(m) = \lim_{x \to +\infty} \mathcal{L}(x). \]

□

The existence of a limiting distribution (Theorem 1.2) is an easy consequence of this.

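Corollary 4.2. Let $k \ge 1$ be a fixed integer. As $h$ goes to infinity, the singular series $\mathfrak{S}(h)$ for $h \in \Omega_1$, i.e., such that $|h| \le h$, converges in law to the random singular series

\[ Z_{Y,k} = \prod_p \Bigl(1 - \frac{1}{p}\Bigr)^{-k} \Bigl(1 - \frac{\rho_p}{p}\Bigr) \]

on $\Omega_2$. In other words, there exists a probability law $\nu_k$ on $[0, +\infty[$, which is the law of $Z_{Y,k}$, such that $\mathfrak{S}(h)$, for $|h| \le h$, becomes equidistributed with respect to $\nu_k$, or equivalently

\[ \lim_{h \to +\infty} \frac{1}{h^k} \sum^*_{|h| \le h} f(\mathfrak{S}(h)) = \int_{\mathbf{R}^+} f(t)\, d\nu_k(t) \]

for any bounded continuous function $f$ on $\mathbf{R}$. Moreover we have

\[ (4.2) \qquad \mu_k(m) = E_2(Z_{Y,k}^m) = \int_{\mathbf{R}^+} t^m\, d\nu_k(t). \]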

Proof. First of all, using (3.10), the monotone and dominated convergence theorems and (3.6) imply that we have

\[ (4.3) \qquad \mu_k(m) = E_2(Z_{Y,k}^m) \]

for all integers $m \ge 1$. Now a standard result of probability theory (the "method of moments") states that given a positive random variable $X$ and a sequence of positive random variables $(X_n)$, such that $E(X^m) < +\infty$, $E(X_n^m) < +\infty$ for all $n$ and $m$, the condition

\[ \lim_{n \to +\infty} E(X_n^m) = E(X^m) \]

for all $m \ge 1$ implies the convergence in law of $X_n$ to $X$, if the moments $E(X^m)$ do not grow too fast (a sufficient, but not necessary condition). In fact, it is enough that the power series

\[ \sum_{m \ge 0} E(X^m) \frac{z^m}{m!} \]

have a non-zero radius of convergence, which in our case holds (with $X = Z_{Y,k}$) by the almost exponential upper bound for $\mu_k(m)$ in Proposition 4.1. Finally, the formula (4.2) follows from (4.3). □



Example 4.3. As a corollary of Proposition 4.1 and symmetry, we have

\[ \log \mu_k(2) = 2k \log\log 3k + O(k) \]

for $k \ge 1$.

Combined with the classical lower bound for non-vanishing arising from Cauchy's inequality, it follows that for every fixed $k \ge 1$, we have

\[ \liminf_{h \to +\infty} \frac{1}{h^k} |\{h \mid |h| \le h \text{ and } \mathfrak{S}(h) \ne 0\}| \ge \frac{1}{\mu_k(2)} \ge \exp(-(2k \log\log 3k + O(k))). \]
This is close to the truth, as one can check by noting that we have in fact$^{10}$

\[ \lim_{h \to +\infty} \frac{1}{h^k} |\{h \mid |h| \le h \text{ and } \mathfrak{S}(h) \ne 0\}| = P_2(Z_{Y,k} \ne 0) = \prod_p P_2(\rho_p < p) \]

using the almost sure absolute convergence of the random Euler product $Z_{Y,k}$. We have the bounds

\[ \frac{(p-1)^k}{p^k} \le P_2(\rho_p < p) \le \frac{p(p-1)^k}{p^k} \]

(since, for $p \le k$, a $k$-tuple will have $\rho_p < p$ only if it omits at least one value in $\mathbf{Z}/p\mathbf{Z}$; the lower bound follows by looking at those omitting 0, for instance, and the upper one is a union bound over the possible omitted values), from which we get

\[ -k \log\log 3k + O(k) \le \log P_2(Z_{Y,k} \ne 0) \le k - k \log\log 3k + O(k), \]

i.e., we have

\[ P_2(Z_{Y,k} \ne 0) = \exp(-k \log\log 3k + O(k)). \]
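For small $k$ the local probabilities $P_2(\rho_p < p)$ can be computed exactly, since $\rho_p = p$ means that the $k$-tuple is surjective onto $\mathbf{Z}/p\mathbf{Z}$. The sketch below (helper names are ours) evaluates the product over primes $p \le k$, the factors with $p > k$ being equal to 1, and can be used to check the two-sided bound above.

```python
from fractions import Fraction
from math import comb


def surjections(k, j):
    """Number of surjective maps {1..k} -> {1..j} (inclusion-exclusion)."""
    return sum((-1) ** i * comb(j, i) * (j - i) ** k for i in range(j + 1))


def prob_misses_a_residue(p, k):
    """P(rho_p < p): a uniform k-tuple modulo p omits at least one class."""
    return 1 - Fraction(surjections(k, p), p ** k)


def is_prime(n):
    return n >= 2 and all(n % q for q in range(2, int(n ** 0.5) + 1))


def nonvanishing_probability(k):
    """Product over primes p <= k of P(rho_p < p), as an exact fraction."""
    result = Fraction(1)
    for p in range(2, k + 1):
        if is_prime(p):
            result *= prob_misses_a_residue(p, k)
    return result
```

For example, `nonvanishing_probability(2)` returns `Fraction(1, 2)`: a pair has non-zero singular series exactly when its two entries have the same parity.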

It follows from this that if we replace the space $\Omega_1$ of all $k$-tuples with distinct entries by the much smaller one

\[ \Omega_1' = \{h \in \Omega_1 \mid \mathfrak{S}(h) \ne 0\}, \]

(which still depends on $h$, with cardinality $\asymp h^k$), the singular series still has a limiting distribution when interpreted as a random variable on $\Omega_1'$ with $h \to +\infty$: indeed, this is the distribution $\nu_k'$ given by

\[ \nu_k'(A) = \frac{\nu_k(A \cap\, ]0, +\infty[)}{\nu_k(]0, +\infty[)} \]

since, for any integer m ^ 1, we have 

h fc ^(^y.fe^lJ) J[o,+oc[ 

as $h \to +\infty$.

Of course, those moments do not satisfy the symmetry property enjoyed by $\mu_k(m)$.

Remark 4.4. Before going on to the second part of this paper, the following question seems natural: are there arithmetic consequences (possibly conditional, similarly to Gallagher's proof of (1.7)) of the existence of $m$-th moments of the singular series for $k$-tuples?



5. Poisson distribution for general prime patterns

In this section, we prove Theorem 1.3, essentially by following Gallagher's reduction to averages of Euler products, which turn out to be easily computable after application of Proposition 2.1.

We fix a primitive family of polynomials $f$ with $\mathfrak{S}(f) \ne 0$ (the reader may want to review the notation in the introduction for what follows). To apply Gallagher's method, we also require some auxiliary families of polynomials, indexed by $k$-tuples.



$^{10}$ This does not follow directly from convergence in law for $\mathfrak{S}(h)$, but from the absolute convergence and local structure of the singular series.
Thus let $k \ge 1$ be an integer and $h$ a $k$-tuple of integers. For our fixed primitive $f$, we denote

\[ f \odot h = (f_j(X + h_i))_{1 \le i \le k,\ 1 \le j \le m}, \]

which is a family of $km$ integer polynomials.

Technical difficulties will arise because this family may not be primitive, even if the components of $h$ are distinct (which is a necessary condition), i.e., we may have an equality

\[ f_{j_1}(X + h_{i_1}) = f_{j_2}(X + h_{i_2}), \]

for some $i_1 \ne i_2$, $j_1 \ne j_2$.

For instance, we have $(X, X+2) \odot (3, 1) = (X+3, X+1, X+5, X+3)$ (in the case of twin primes). However, we will show that these degeneracies have no effect for the problem at hand. Moreover, $f \odot h$ is primitive whenever $h$ has distinct components, in the following quite general situations:

- if $m = 1$;

- if the degrees of the $f_j$ are distinct;

- if no two among the polynomials $f_j$ are related by a translation $X \mapsto X + a$, for some $a \in \mathbf{Z}$.

This means that the reader may well disregard the technical problems in a first reading (for the twin primes, see also Example 5.9, which explains a special reason why the degeneracies have no consequence then). The following lemma is already a first step, and we will need it before proving the full statement.

Lemma 5.1. Let $f$ be a primitive family and $k \ge 1$. Then for any $h \ge 1$, we have

\[ |\{h \mid |h|_k \le h,\ f \odot h \text{ is not primitive}\}| \ll h^{k-1} \]

where the implied constant depends only on $k$ and $m$.

Proof. Let $I$ be the set of $k$-tuples $h$ with distinct components such that $f \odot h$ is not primitive. If $h \in I$, then there exists at least one relation of the type

\[ (5.1) \qquad f_{j_1}(X + h_{i_1}) = f_{j_2}(X + h_{i_2}), \qquad i_1 \ne i_2,\ j_1 \ne j_2, \]

hence

\[ f_{j_1}(X) = f_{j_2}(X + h_{i_2} - h_{i_1}), \]

so the two polynomials differ by a "shift". Let $\mathcal{R}$ be the set of pairs $(j_1, j_2)$ for which

\[ f_{j_1}(X) = f_{j_2}(X + \delta(j_1, j_2)) \]

for some integer $\delta(j_1, j_2) \ne 0$. Because the polynomials involved are non-constant, this integer is indeed unique. The cardinality of $\mathcal{R}$ is bounded in terms of $m$ only, and from the above, any $k$-tuple $h \in I$ must satisfy at least one relation

\[ h_{i_1} - h_{i_2} = \delta(j_1, j_2), \]

for some $i_1 \ne i_2$ and $(j_1, j_2) \in \mathcal{R}$. Each such relation is valid for at most $h^{k-1}$ among the $k$-tuples with $|h| \le h$. □
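To make the construction of $f \odot h$ and the count in Lemma 5.1 concrete, here is a small sketch. Polynomials are stored as coefficient tuples, lowest degree first (a representation chosen for this illustration), and the test for primitivity only checks that the $km$ shifted polynomials are pairwise distinct, which is what matters once $f$ itself is assumed to be primitive.

```python
from itertools import product


def shift(poly, a):
    """Coefficients (low to high degree) of poly(X + a), by Horner's scheme."""
    result = [0]
    for c in reversed(poly):
        # result <- result * (X + a) + c
        nxt = [0] * (len(result) + 1)
        for i, r in enumerate(result):
            nxt[i] += a * r
            nxt[i + 1] += r
        nxt[0] += c
        result = nxt
    while len(result) > 1 and result[-1] == 0:
        result.pop()
    return tuple(result)


def family_shift(f, h):
    """The km polynomials f_j(X + h_i), with j as outer and i as inner index."""
    return [shift(fj, hi) for fj in f for hi in h]


def shifted_family_is_primitive(f, h):
    """True if the f_j(X + h_i) are pairwise distinct (f assumed primitive)."""
    polys = family_shift(f, h)
    return len(set(polys)) == len(polys)


def count_degenerate(f, k, bound):
    """Number of k-tuples with distinct entries in [1, bound] for which the
    shifted family is not primitive."""
    return sum(
        1
        for h in product(range(1, bound + 1), repeat=k)
        if len(set(h)) == k and not shifted_family_is_primitive(f, h)
    )
```

With the twin prime family `f = [(0, 1), (2, 1)]`, that is $(X, X+2)$, the call `family_shift(f, (3, 1))` reproduces the example $(X+3, X+1, X+5, X+3)$ above, and `count_degenerate(f, 2, bound)` grows linearly in `bound`, in line with the bound $h^{k-1}$.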

We will deduce Theorem 1.3 from the following (unconditional) result, which is another instance of average of Euler products:
Proposition 5.2. Let $f = (f_1, \ldots, f_m)$ be a primitive family and $k \ge 1$ an integer. Then we have

\[ \lim_{h \to +\infty} \frac{1}{h^k} \sum^*_{|h| \le h} \mathfrak{S}(f \odot h) = \mathfrak{S}(f)^k, \]

where $\sum^*$ here restricts the summation to those $k$-tuples for which $f \odot h$ is primitive.

Remark 5.3. Taking $f = (X)$, with $\mathfrak{S}(f) = 1$ and $f \odot h = (X + h_1, \ldots, X + h_k)$, we recover once more Gallagher's result (1.5).

We have the following complementary statement, which is also unconditional (recall that, in many cases, it holds for trivial reasons; it does not follow trivially from Lemma 5.1 because although fewer $k$-tuples are concerned, the number of prime seeds increases when $f \odot h$ is not primitive).

Lemma 5.4. Let $f = (f_1, \ldots, f_m)$ be a primitive family with $\mathfrak{S}(f) \ne 0$, and $k \ge 1$ an integer. Then for any $N \ge 2$, if $h \le A(\log N)^m$ for some $A > 0$, and for any $\varepsilon > 0$, we have

\[ \frac{1}{N} \sum^*_{\substack{|h|_k \le h \\ f \odot h \text{ not primitive}}} \pi(N; f \odot h) \ll (\log N)^{-1+\varepsilon}, \]

where $\sum^*$ restricts the sum to those $k$-tuples with distinct entries, and where the implied constant depends only on $k$, $f$, $A$ and $\varepsilon$.

Here is the proof of the (conditional) Poisson distribution, assuming those two 
results. 

Proof of Theorem 1.3. The argument is essentially identical with that of Gallagher, but we reproduce it for completeness, and so that the necessary uniformity in the Bateman-Horn conjecture becomes clear.

Because the Poisson distribution is characterized by its moments, it is enough to prove that for any fixed integer $k \ge 1$, we have

\[ \frac{1}{N} \sum_{n \le N} \bigl(\pi(n + \lambda\delta(N, f); f) - \pi(n; f)\bigr)^k \to E(P_\lambda^k) \qquad \text{as } N \to +\infty, \]

where $P_\lambda$ is any Poisson random variable with mean $\lambda$.

Write $h = \lambda\delta(N, f)$. Expanding the left-hand side, we obtain

\[ \frac{1}{N} \sum_{n \le N} \sum_{\substack{n < m_1, \ldots, m_k \le n + h \\ m_i\ f\text{-prime seed}}} 1, \]

where there are $k$ sums over $m_1, \ldots, m_k$. Write $m_i = n + h_i$, so that $1 \le h_i \le h$, and the condition becomes that $f_j(n + h_i)$ is prime for all $i$ and $j$, i.e., that $n$ be an $f \odot h$-prime seed. Exchanging the order of summation, we get

\[ \frac{1}{N} \sum_{|h|_k \le h} \pi(N; f \odot h). \]

Before applying (1.8), we need to account for the $k$-tuples which do not necessarily have distinct components, and for those where $f \odot h$ is not primitive.

For this, observe first that $\pi(N; f \odot h)$ only depends on the set containing the components of the $k$-tuple $h$. This justifies the fact that the reorderings that
follow are permissible. For each $r$, $1 \le r \le k$, and each $r$-tuple $h'$ with distinct components, the set of those $k$-tuples for which the set of values is given by the set of components of $h'$ has cardinality depending only on $r$ and $k$, but independent of $h'$, and in fact it is given by $\left\{{k \atop r}\right\}$ (one can assume that $h' = (1, \ldots, r)$, and obtain a bijection

\[ \{\text{suitable } k\text{-tuples}\} \to \{\text{surjective maps } \{1, \ldots, k\} \to \{1, \ldots, r\}\}, \qquad h \mapsto (i \mapsto h_i) \]

between the two sets).
Then we can write

\[ \frac{1}{N} \sum_{|h|_k \le h} \pi(N; f \odot h) = \frac{1}{N} \sum_{r=1}^{k} \frac{1}{r!} \left\{{k \atop r}\right\} \sum^*_{|h'|_r \le h} \pi(N; f \odot h'), \]

where we divide by $r!$ because we sum over all $r$-tuples instead of only ordered ones, and $\sum^*$ restricts to $r$-tuples with distinct entries.

Now, for each $r$, we separate the sum over $r$-tuples for which $f \odot h'$ is primitive from the other subsum. Applying (1.8) and using the easy bound

\[ c(f \odot h') \ll c(f)\, |h'|^{\max \deg(f_j)}, \]

(where the implied constant depends on $r$ and $f$) the first sum (still denoted $\sum^*$) is equal to

Aim i i 



;* 6(/©fco(i+o(j 



4^ r! \rj peg(/) r (log TV)™ ^ y J \ \ log TV 

r—l \h'\ r ^~h 

for any $\varepsilon > 0$, where the implied constant depends on $f$, $k$ and $\varepsilon$. Using Proposition 5.2 and the choice of $h = \lambda\,\mathrm{peg}(f)\,\mathfrak{S}(f)^{-1} (\log N)^m$, this converges as $N \to +\infty$ to the limit

\[ \sum_{r=1}^{k} \frac{1}{r!} \left\{{k \atop r}\right\} \lambda^r, \]

which is well-known to be the $k$-th moment of a Poisson distribution with mean $\lambda$ (this is checked by Gallagher for instance, see [Gal, §3]). Hence, to conclude the proof, we need only notice that Lemma 5.4 (applied with $k = r$ for $1 \le r \le k$) implies (taking $\varepsilon = 1/2$ for concreteness) that the complementary sum is bounded by

\[ \frac{1}{N} \sum_{r=1}^{k} \frac{1}{r!} \left\{{k \atop r}\right\} \sum^*_{\substack{|h'|_r \le h \\ f \odot h' \text{ not primitive}}} \pi(N; f \odot h') \ll (\log N)^{-1/2} \]

for $N \ge 2$, where the implied constant depends on $k$, $f$ and $\lambda$. Hence this second contribution goes to 0 as $N \to +\infty$, as desired. □
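The combinatorial identity behind the last step can be checked numerically. In the sketch below, `surjections(k, r)` is the number of surjective maps $\{1,\ldots,k\} \to \{1,\ldots,r\}$, which by the bijection described in the proof is the number of $k$-tuples with a prescribed set of $r$ values; we then compare $\sum_r \frac{1}{r!}\,\mathrm{surj}(k,r)\,\lambda^r$ with a direct (truncated) evaluation of the $k$-th moment of a Poisson variable. All function names and the truncation parameter `terms` are illustrative choices.

```python
from itertools import product
from math import comb, exp, factorial


def surjections(k, r):
    """Number of surjective maps {1..k} -> {1..r} (inclusion-exclusion)."""
    return sum((-1) ** i * comb(r, i) * (r - i) ** k for i in range(r + 1))


def tuples_with_value_set(k, r):
    """Brute-force count of k-tuples whose set of values is exactly {1..r}."""
    target = set(range(1, r + 1))
    return sum(1 for t in product(range(1, r + 1), repeat=k) if set(t) == target)


def poisson_moment_combinatorial(k, lam):
    """Sum over r of surjections(k, r) / r! * lam^r."""
    return sum(surjections(k, r) / factorial(r) * lam ** r for r in range(1, k + 1))


def poisson_moment_direct(k, lam, terms=200):
    """Truncated series sum_n n^k * exp(-lam) * lam^n / n!."""
    total, weight = 0.0, exp(-lam)
    for n in range(terms):
        total += n ** k * weight
        weight *= lam / (n + 1)
    return total
```

For $\lambda = 1$ both functions return the Bell numbers (for instance 5 when $k = 3$), as expected for the moments of a Poisson variable of mean 1.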



We now prove Proposition 5.2. This is the conjunction of the two following lemmas, where we use the same notation as in Section 3 but change a bit the definition of probability spaces. Precisely,

\[ \Omega_2 = \prod_p (\mathbf{Z}/p\mathbf{Z})^k \]
is unchanged, but we let

\[ \Omega_1 = \{h = (h_1, \ldots, h_k) \mid 1 \le h_i \le h,\ f \odot h \text{ is primitive}\} \]

with the counting probability measure (note that the condition forces $h$ to have distinct coordinates). By Lemma 5.1, note that we have

\[ (5.2) \qquad |\Omega_1| \sim h^k \qquad \text{as } h \to +\infty. \]

The next lemma shows that the average of Euler products involved can be computed as if the components were independent:
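Lemma 5.5. Let $f = (f_1, \ldots, f_m)$ be a primitive family with $\mathfrak{S}(f) \ne 0$. Then for any $k \ge 1$, we have

\[ \lim_{h \to +\infty} \frac{1}{h^k} \sum^*_{|h| \le h} \mathfrak{S}(f \odot h) = \lim_{h \to +\infty} E_1\Bigl(\prod_p \Bigl(1 - \frac{1}{p}\Bigr)^{-km} \Bigl(1 - \frac{\nu_{p,f}}{p}\Bigr)\Bigr) = \prod_p \Bigl(1 - \frac{1}{p}\Bigr)^{-km} E_2\Bigl(1 - \frac{\rho_{p,f}}{p}\Bigr), \]

where

\[ \nu_{p,f}(h) = \nu_p(f \odot h) \quad \text{for } h = (h_1, \ldots, h_k) \text{ with } h_i \ge 1, \]
\[ \rho_{p,f}(h) = |\{x \in \mathbf{Z}/p\mathbf{Z} \mid f_j(x + h_i) = 0 \text{ for some } i, j\}| \quad \text{for } h \in (\mathbf{Z}/p\mathbf{Z})^k. \]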
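The second lemma computes the limit locally:

Lemma 5.6. Let $f = (f_1, \ldots, f_m)$ be a primitive family. Then for any $k \ge 1$ and any prime $p$, we have

\[ \Bigl(1 - \frac{1}{p}\Bigr)^{-km} E_2\Bigl(1 - \frac{\rho_{p,f}}{p}\Bigr) = \Bigl(\Bigl(1 - \frac{1}{p}\Bigr)^{-m} \Bigl(1 - \frac{\nu_p(f)}{p}\Bigr)\Bigr)^k. \]
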
Looking at the definition (1.4) of $\mathfrak{S}(f)$, both lemmas together prove Proposition 5.2. We start by proving Lemma 5.6 because Lemma 5.5 is certainly plausible enough in view of Section 3 and the reader may be more interested by the final formal flourish.

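Proof of Lemma 5.6. It suffices to compute

\[ E_2\Bigl(1 - \frac{\rho_{p,f}}{p}\Bigr) \]

since the other factor is the same on both sides. We argue probabilistically, although one can also just expand the various sums (and do the same steps in a different language, as we did when proving the symmetry (1.6)). We can write

\[ 1 - \frac{\rho_{p,f}}{p} = \frac{|\mathbf{Z}/p\mathbf{Z} - M|}{p} \]

where $M \subset \mathbf{Z}/p\mathbf{Z}$ is the (random) subset of those $x \in \mathbf{Z}/p\mathbf{Z}$ such that $f_j(x + h_i) = 0$ for some $i$ and $j$. We write

\[ |\mathbf{Z}/p\mathbf{Z} - M| = \sum_{x \in \mathbf{Z}/p\mathbf{Z}} (1 - \chi_M(x)) \]

where $\chi_M(x)$ is the random variable equal to one if $x \in M$ and zero otherwise. We have

\[ 1 - \chi_M(x) = \prod_{1 \le i \le k} \prod_{1 \le j \le m} \bigl(1 - \mathbf{1}_{\{f_j(x + h_i) = 0\}}\bigr) = \prod_{1 \le i \le k} \xi_{f,i}(x), \]

say. Since $\xi_{f,i}(x)$ only involves the $i$-th component of the random $h \in \Omega_2$, the family $(\xi_{f,i}(x))$ is an independent $k$-tuple of random variables. Consequently we derive

\[ E_2\Bigl(1 - \frac{\rho_{p,f}}{p}\Bigr) = \frac{1}{p} \sum_{x \in \mathbf{Z}/p\mathbf{Z}} E_2\Bigl(\prod_{1 \le i \le k} \xi_{f,i}(x)\Bigr) = \frac{1}{p} \sum_{x \in \mathbf{Z}/p\mathbf{Z}} \prod_{1 \le i \le k} E_2(\xi_{f,i}(x)). \]

To conclude we notice that for every $x$ and $i$, $h \mapsto x + h_i$ is identically (uniformly) distributed, so that all $\xi_{f,i}(x)$ are identically distributed like

\[ \xi_f = \xi_{f,1}(0) = \prod_{j=1}^{m} \bigl(1 - \mathbf{1}_{\{f_j(h_1) = 0\}}\bigr). \]

Hence all $x$ give the same contribution, and we derive that

\[ E_2\Bigl(1 - \frac{\rho_{p,f}}{p}\Bigr) = E_2(\xi_f)^k = P_2(f_1(h_1) \cdots f_m(h_1) \ne 0)^k = \Bigl(1 - \frac{\nu_p(f)}{p}\Bigr)^k, \]

since $h_1$ is uniformly distributed in $\mathbf{Z}/p\mathbf{Z}$. □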
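The local identity $E_2(1 - \rho_{p,f}/p) = (1 - \nu_p(f)/p)^k$ obtained at the end of this proof can be confirmed by brute force for small primes. The sketch below enumerates all $h \in (\mathbf{Z}/p\mathbf{Z})^k$; polynomials are coefficient tuples with the lowest degree first, and the function names are ours.

```python
from fractions import Fraction
from itertools import product


def evaluate(poly, x, p):
    """Value at x, modulo p, of a polynomial given by coefficients low to high."""
    value = 0
    for c in reversed(poly):
        value = (value * x + c) % p
    return value


def nu_p(f, p):
    """Number of x modulo p at which at least one f_j vanishes."""
    return sum(1 for x in range(p) if any(evaluate(fj, x, p) == 0 for fj in f))


def expected_local_density(f, p, k):
    """E_2(1 - rho_{p,f}/p), by enumeration of all h in (Z/pZ)^k."""
    total = Fraction(0)
    for h in product(range(p), repeat=k):
        rho = sum(
            1
            for x in range(p)
            if any(evaluate(fj, (x + hi) % p, p) == 0 for fj in f for hi in h)
        )
        total += Fraction(p - rho, p)
    return total / p ** k
```

For instance, with the twin prime family `[(0, 1), (2, 1)]` and $p = 5$, $k = 2$, the enumeration gives $(3/5)^2$, since $\nu_5(f) = 2$.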

To prove Lemma 5.5, we wish to apply Proposition 2.1. A complication is that, if $\mathrm{peg}(f) \ne 1$, the singular series $\mathfrak{S}(f \odot h)$ are not defined by absolutely convergent products, and therefore the result is not directly applicable. However, we can bypass this difficulty here without significant work because of the following fact: all the relevant Euler products can be uniformly "renormalized" to absolutely convergent ones. This is the content of the next lemma.

Lemma 5.7. Let $f$ be a primitive family with $\mathfrak{S}(f) \ne 0$, and let $k \ge 1$ be an integer. There exist real numbers $\gamma_p(f) > 0$, for all primes $p$, such that the product

\[ \prod_p \gamma_p(f) \]

converges, and such that the following hold:

(1) For all primes $p$, and all $k$-tuples $h \in (\mathbf{Z}/p\mathbf{Z})^k$, we have

\[ \Bigl(1 - \frac{1}{p}\Bigr)^{-km} \Bigl(1 - \frac{\rho_{p,f}(h)}{p}\Bigr) = \gamma_p(f) \times (1 + X_{p,f}(h)) \]

for some coefficients $X_{p,f}(h)$, and for all $k$-tuples of integers $h$ such that $f \odot h$ is primitive, the product

\[ (5.3) \qquad \prod_p (1 + X_{p,f}(h)) \]

is absolutely convergent.

(2) We have

\[ \lim_{h \to +\infty} \frac{1}{h^k} \sum^*_{|h|_k \le h} \prod_p (1 + X_{p,f}(h)) = \prod_p (1 + E_2(X_{p,f})), \]

where the sum is over $k$-tuples with $f \odot h$ primitive.
Proof. (1) To define $\gamma_p(f)$, let $\theta_j$, $1 \le j \le m$, be a complex root of the irreducible polynomial $f_j$, and let $K_j = \mathbf{Q}(\theta_j)$ be the extension of $\mathbf{Q}$ of degree $\deg(f_j)$ generated by $\theta_j$. Then put



V 



where $r_j(n)$, for $n \ge 1$, is the number of prime ideals of norm $n$ in the ring of integers of $K_j$. In view of this definition, to check first that the product of $\gamma_p(f)$ converges, we can do so for each $f_j$ separately. Then the statement follows, after taking the logarithm of a partial product over $p \le X$, from the well-known asymptotic formula

\[ \sum_{p \le X} \frac{r_j(p)}{p} = \sum_{p \le X} \frac{1}{p} + c(K_j) + O((\log X)^{-1}) \]

for $X \ge 2$, where $c(K_j)$ is a constant depending only on $K_j$, and the implied constant also depends only on $K_j$.

It therefore remains to prove that the product (5.3) is absolutely convergent for any $k$-tuple of integers $h$ with $f \odot h$ primitive. To do so, we claim that there exists an integer $D(h) \ge 1$ (which may also depend on $f$) such that, for $p \nmid D(h)$, we have

\[ (5.4) \qquad \rho_{p,f}(h) = k \sum_{j=1}^{m} \nu_p(f_j) = k \sum_{j=1}^{m} r_j(p). \]

The desired convergence then follows from that of

n^-^-idS). n Hr~ w (i-*^). 

and the latter is clear since the $p$-factor can be written $1 + O(p^{-2})$, where the implied constant depends only on $k$ and $f$.
The existence of $D(h)$ is easy; first, let

\[ D_1(h) = \Bigl|\prod_{(i,j) \ne (i',j')} \mathrm{Res}(f_j(X + h_i), f_{j'}(X + h_{i'}))\Bigr|, \]

where $\mathrm{Res}(\cdot, \cdot)$ is the resultant of two polynomials. By compatibility of the resultant with reduction modulo $p$, we have $p \mid D_1(h)$ if and only if, for some $(i, j) \ne (i', j')$, there exists a common zero $x \in \mathbf{Z}/p\mathbf{Z}$ of $f_j(X + h_i)$ and $f_{j'}(X + h_{i'})$. By contraposition, we first obtain

\[ \rho_{p,f}(h) = k\nu_p(f) = k \sum_{j=1}^{m} \nu_p(f_j), \]

for $p \nmid D_1(h)$ (the sets of zeros modulo $p$ of the components of $f \odot h$ are then distinct, and obviously there are as many, namely the sum $\nu_p(f)$ of the $\nu_p(f_j)$, for each of the $k$ shifts $h_i$).

Next, it is a standard fact of algebraic number theory that for each $j$, there exists an integer $\Delta_j \ge 1$ such that $\nu_p(f_j) = r_j(p)$ for $p \nmid \Delta_j$. Thus we can take

\[ D(h) = D_1(h) \prod_j \Delta_j \]

to obtain the second equality in (5.4).
Note that $D(h)$ is non-zero (hence $\ge 1$) because otherwise, there would exist a common zero $\theta \in \mathbf{C}$ of $f_j(X + h_i)$ and $f_{j'}(X + h_{i'})$, and because those are irreducible integral primitive$^{11}$ polynomials with positive leading coefficient, this is only possible if

\[ f_j(X + h_i) = f_{j'}(X + h_{i'}), \]

which is excluded by the assumption that $f \odot h$ be primitive.
Note in passing the estimate

D(h) <C (2\h\ k ) 2k2,n ^ dcsif ^ 

for all $h$, where the implied constant depends only on $f$; this follows straightforwardly from the determinant expression of the resultant in $D_1(h)$ (see, e.g., [L, §V.10]).

(2) With the bounds we have proved on $X_{p,f}(h)$ (leading to an analogue of Lemma nO]), and the estimate on $D(h)$ (analogue of p.5[) ), together with Lemma 5.1 to ensure that the equidistribution of $k$-tuples modulo squarefree integers $q$ remains valid (compare with p. Ill) ), we can pretty much follow the steps of the proof of Theorem 1.1. We also use (5.2) to go from the limit of the expectation on $\Omega_1$ to summing over $k$-tuples normalized by $1/h^k$ and taking $h \to +\infty$. The details are left to the reader. □

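Proof of Lemma 5.5. We have first

\[ \frac{1}{h^k} \sum^*_{|h| \le h} \mathfrak{S}(f \odot h) = \frac{1}{h^k} \sum^*_{|h| \le h} \Bigl(\prod_p \gamma_p(f)\Bigr) \prod_p (1 + X_{p,f}(h)) \to \Bigl(\prod_p \gamma_p(f)\Bigr) \prod_p (1 + E_2(X_{p,f})) \qquad \text{as } h \to +\infty, \]

by the above, and then we can simply write this limit as

\[ \Bigl(\prod_p \gamma_p(f)\Bigr) \prod_p (1 + E_2(X_{p,f})) = \prod_p E_2\bigl(\gamma_p(f)(1 + X_{p,f})\bigr). \]

□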

We conclude with the last remaining part of the proof, namely Lemma 5.4. The following proof can almost certainly be improved, but although the statement becomes fairly clear after checking one or two examples, the author has not found a cleaner way to deal with the apparent possibilities of combinatorial complications. The point is that as $f \odot h$ becomes "less primitive" (i.e., there are fewer distinct elements among the $km$ polynomials involved), the number of prime seeds $\le N$ should increase (by a power of $\log N$), but also the number of $k$-tuples with this property diminishes (by a power of $h \le A(\log N)^m$), and this gain has to compensate for the loss.

Proof of Lemma 5.4. We first quote a standard sieve upper-bound for an individual primitive family $f$ (with $m$ elements), which is uniform, and which allows us to


$^{11}$ In the sense that the gcd of their coefficients is 1.



AVERAGES OF EULER. PRODUCTS 






prove the lemma unconditionally: for $N \ge 2$, for any $k$-tuple $h$ with distinct elements for which $f \odot h$ contains $\ell$ distinct components, we have

\[ (5.5) \qquad \pi(N; f \odot h) \ll (\log\log 3|h|)^{km} \frac{N}{(\log N)^{\ell}}, \]

where the implied constant depends only on $k$ and $f$. Precisely, (5.5) for $k$-tuples follows immediately from, e.g., Th. 2.3 in [HR], and it is easy to adapt this to the case at hand since uniformity is only asked with respect to $h$. Note also that, since the application we give is conditional on much stronger statements like (1.8), we could also apply the latter for this purpose.

Now, as in the proof of Lemma 5.1, we denote by $I$ the set of $k$-tuples $h$ with distinct components such that $f \odot h$ is not primitive. Recall $\mathcal{R}$ is the set of pairs $(j_1, j_2)$ for which

\[ f_{j_1}(X) = f_{j_2}(X + \delta(j_1, j_2)) \]

for some (unique) integer $\delta(j_1, j_2) \ne 0$.

We continue as follows: for an $h \in I$, let $\Gamma_h$ be the graph with vertex set $\{1, \ldots, k\}$ and with (unoriented) edges $(i_1, i_2)$ corresponding to those indices for which the relation

\[ (5.6) \qquad h_{i_1} - h_{i_2} = \delta(j_1, j_2) \]

holds for some $(j_1, j_2) \in \mathcal{R}$; the proof of Lemma 5.1 shows that there is at least one edge. Because the number of possibilities for $\Gamma_h$ is clearly bounded in terms of $k$ only, and we allow a constant depending on $k$ in our estimate, we may continue by fixing one possible graph $\Gamma$ and assuming that all $h \in I$ satisfy $\Gamma_h = \Gamma$.

This being done, we first estimate from above the number of $k$-tuples which lie in $I$ (under the above assumption that the graph is fixed!). We claim that

\[ (5.7) \qquad |\{h \in I \mid |h| \le h\}| \ll h^c \]

where $c = |\pi_0(\Gamma)|$ is the number of connected components of $\Gamma$.

To see this, notice that each connected component $C$ corresponds to a set of variables which are independent of all others, so that $I$ is the product over the connected components of sets $I_C$ of $|C|$-tuples satisfying the relations (5.6) dictated by $C$. Now we have

\[ |\{h \in I_C \mid |h| \le h\}| \ll h, \]

because $C$ is connected: if we fix some vertex $i_0$ of $C$, then for any choice of $h_{i_0}$, the value of $h_i$ is determined by means of the relations (5.6) for all vertices $i$ of $C$, using induction on the length of a path from $i_0$ to $i$ (which exists by connectedness).

Taking the product over $C$ of these individual upper bounds, we obtain the desired estimate (5.7).

We next need to estimate from below the number of distinct elements in the family $f \odot h$ for a fixed $h \in I$ (still under the assumption that the graph $\Gamma_h = \Gamma$ is fixed).

Let again $C$ be a connected component of the graph $\Gamma$. We consider the set (say $\{f \odot h\}_C$) of polynomials of the form $f_j(X + h_i)$, where $1 \le j \le m$ and $i$ is a vertex of $C$. We claim this set contains at least $m + 1$ distinct polynomials if $C$ has at least 2 vertices, and $m$ if $C$ is a singleton. Indeed, fixing a vertex $i_0$ of $C$, the set contains the polynomials $f_j(X + h_{i_0})$, which are distinct since $f$ is a primitive family. This already takes care of the case where $C$ is a singleton, so assume now that $C$ contains






at least another vertex $i$. If all the $m$ distinct polynomials $f_j(X + h_i)$ were already in the set $\{f_j(X + h_{i_0})\}$, this would define a permutation $\sigma$ of $\{1, \ldots, m\}$ such that

\[ f_j(X + h_i) = f_{\sigma(j)}(X + h_{i_0}), \qquad 1 \le j \le m. \]

Consider a cycle $(j_1, \ldots, j_\ell)$ of length $\ell$ in the decomposition of $\sigma$; applying the identity to $j_1$, $\sigma(j_1) = j_2$, etc., in turn, we derive the identity

\[ f_{j_1}(X) = f_{\sigma^{\ell}(j_1)}(X + (\ell - 1)(h_{i_0} - h_i)) = f_{j_1}(X + (\ell - 1)(h_{i_0} - h_i)). \]

Since $f_{j_1}$ is non-constant and $h_{i_0} \ne h_i$, we deduce that $\ell = 1$; this holding for all cycles in $\sigma$ would mean that $\sigma$ is the identity, but then $f_1(X + h_i) = f_1(X + h_{i_0})$ again contradicts the fact that $h$ has distinct components. This means that $\sigma$ cannot exist, and so the set $\{f_j(X + h_i)\}$ contains at least one polynomial not among the first $m$ ones, which was our objective.

Next observe that, by the very definition of the graph $\Gamma$, the sets $\{f \odot h\}_C$ are disjoint when $C$ runs over the connected components of $\Gamma$, and hence we find that any $f \odot h$ contains at least $cm + d$ elements, where $d$ is the number of connected components of $\Gamma$ which are not singletons. Note that $d \ge 1$, because $\Gamma$ has at least one edge.
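The quantities $c$, $d$ and the lower bound $cm + d$ can be computed explicitly for a given family and tuple. The sketch below is only an illustration under our own conventions: polynomials are coefficient tuples (lowest degree first), the possible shifts $\delta(j_1, j_2)$ are found by comparing the two leading coefficients, and the graph $\Gamma_h$ is handled with a small union-find. The sample family is the one of Remark 5.8.

```python
def shift(poly, a):
    """Coefficients (low to high degree) of poly(X + a), by Horner's scheme."""
    result = [0]
    for c in reversed(poly):
        nxt = [0] * (len(result) + 1)
        for i, r in enumerate(result):
            nxt[i] += a * r
            nxt[i + 1] += r
        nxt[0] += c
        result = nxt
    while len(result) > 1 and result[-1] == 0:
        result.pop()
    return tuple(result)


def shift_relations(f):
    """All non-zero integers d with f_{j1}(X) = f_{j2}(X + d) for some j1 != j2."""
    deltas = set()
    for j1, g in enumerate(f):
        for j2, q in enumerate(f):
            if j1 == j2 or len(g) != len(q) or len(q) < 2:
                continue
            deg, lead = len(q) - 1, q[-1]
            # The coefficient of X^(deg-1) in q(X + d) is q[deg-1] + deg*lead*d.
            num = g[-2] - q[-2]
            if num % (deg * lead) == 0:
                d = num // (deg * lead)
                if d != 0 and shift(q, d) == tuple(g):
                    deltas.add(d)
    return deltas


def components(h, deltas):
    """Connected components of the graph with an edge (a, b) when h_a - h_b is
    one of the admissible shifts (the set of shifts is symmetric under d -> -d)."""
    k = len(h)
    parent = list(range(k))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a in range(k):
        for b in range(a + 1, k):
            if h[a] - h[b] in deltas:
                parent[find(a)] = find(b)
    groups = {}
    for i in range(k):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


def distinct_count(f, h):
    """Number of distinct polynomials among the f_j(X + h_i)."""
    return len({shift(fj, hi) for fj in f for hi in h})


# Family of Remark 5.8: X^2 + 7, (X + 2)^2 + 7, (X + 4)^2 + 7.
SAMPLE_FAMILY = [(7, 0, 1), (11, 4, 1), (23, 8, 1)]
```

For `h = (5, 7)` the graph has a single component with two vertices, so $c = d = 1$ and the bound $cm + d = 4$ is attained by `distinct_count(SAMPLE_FAMILY, (5, 7))`.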

We finally estimate the contribution of $k$-tuples in $I$ using (5.5) and (5.7): we obtain

\[ \frac{1}{N} \sum_{\substack{h \in I \\ |h| \le h}} \pi(N; f \odot h) \ll h^c (\log 2h)^{km} (\log N)^{-cm-d} \]

where the implied constant depends on $k$ and $f$. If $h \le A(\log N)^m$, as assumed in Lemma 5.4, we obtain

\[ \frac{1}{N} \sum_{\substack{h \in I \\ |h| \le h}} \pi(N; f \odot h) \ll (\log N)^{-d+\varepsilon} \]

for any $\varepsilon > 0$, where the implied constant depends on $k$, $A$, $f$ and $\varepsilon$. Since $d \ge 1$, the lemma is finally proved. □

Remark 5.8. The gain of $(\log N)^{-1}$ is indeed the best possible in general. Consider for example the primitive family $f = (f_1, f_2, f_3) = (X^2 + 7, (X+2)^2 + 7, (X+4)^2 + 7)$ for which it is easy to check that $\mathfrak{S}(f) \ne 0$ ($-7$ is not a square modulo 3 or 5, and each $f_j(0)$ is odd). We have relations $f_1(X + 2) = f_2(X)$, $f_2(X + 2) = f_3(X)$.

Consider $k = 2$. If we look at 2-tuples $h = (h_1, h_2)$ for which $h_2 = h_1 + 2$, we obtain

\begin{align*}
f \odot h &= (f_1(X + h_1), f_2(X + h_1), f_3(X + h_1), f_1(X + h_2), f_2(X + h_2), f_3(X + h_2)) \\
&= (f_1(X + h_1), f_1(X + h_1 + 2), f_1(X + h_1 + 4), f_1(X + h_2), f_1(X + h_2 + 2), f_1(X + h_2 + 4)) \\
&= (f_1(X + h_2 - 2), f_1(X + h_2), f_1(X + h_2 + 2), f_1(X + h_2), f_1(X + h_2 + 2), f_1(X + h_2 + 4)),
\end{align*}

which contains 4 distinct polynomials. With $h \asymp A(\log N)^3$, those 2-tuples with $|h| \le h$ contribute about $N(\log N)^{3-4}$ to the sum of Lemma 5.4 (under (1.8), of course).
Finally, here are a few examples.

Example 5.9. (1) If we take $f_1 = (X, X + 2)$, we obtain that the number of twin primes $(p, p + 2)$ with $n < p \le n + \lambda(\log n)^2$ should be approximately distributed like a Poisson random variable with mean

\[ 2\lambda \prod_{p \ge 3} \Bigl(1 - \frac{1}{(p-1)^2}\Bigr) \approx 1.3203236\ldots\, \lambda. \]

Similarly, if we take $f_2 = (X, 2X + 1)$, we find that the number of Germain primes (i.e., primes $p$ with $2p + 1$ also prime) with $n < p \le n + \lambda(\log n)^2$ should be approximately distributed like a Poisson random variable with mean

\[ \lambda\mathfrak{S}(f_2) = 2\lambda \prod_{p \ge 3} \Bigl(1 - \frac{1}{(p-1)^2}\Bigr). \]
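The constant appearing in both means is easy to approximate by truncating the Euler product. The following sketch does this with a plain sieve; the function names and the default cut-off `prime_bound` are arbitrary choices, and the truncation error is of the order of $1/(P \log P)$ for a cut-off $P$.

```python
def primes_up_to(n):
    """Primes <= n by a basic sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]


def twin_singular_series(prime_bound=10 ** 5):
    """Truncation of 2 * prod_{p >= 3} (1 - 1/(p - 1)^2)."""
    value = 2.0
    for p in primes_up_to(prime_bound):
        if p >= 3:
            value *= 1 - 1 / (p - 1) ** 2
    return value
```

Calling `twin_singular_series()` gives a value agreeing with $1.3203236\ldots$ to about five decimal places.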

Two further remarks are interesting here. First, the proof of Theorem 1.3 shows that whenever $f$ consists of linear polynomials (in particular for those two results), "only" the (uniform) Hardy-Littlewood conjecture is needed. In other words, no assumption is required beyond those of Gallagher's original result for the primes themselves.

Secondly, if one is interested in the case of twin primes in particular, Lemma 5.4 has a trivial proof from the following coincidence: if $f = (X, X + 2)$, $h$ has distinct entries, and $f \odot h$ is not primitive, then

\[ \mathfrak{S}(f \odot h) = 0, \qquad \pi(N; f \odot h) \le 1. \]

Indeed, if $f \odot h$ is not primitive, we have $k \ge 2$ and an equality $h_{i_2} = h_{i_1} + 2$ for some $i_1$, $i_2$. The family $f \odot h$ contains in particular the three polynomials $X + h_{i_1}$, $X + h_{i_2} = X + h_{i_1} + 2$ and $X + h_{i_2} + 2 = X + h_{i_1} + 4$. Hence, to be a prime seed for $f \odot h$, an integer $n \ge 1$ must be such that, in particular, the triple $(n + h_{i_1}, n + h_{i_1} + 2, n + h_{i_1} + 4)$ consists of prime numbers. But those three numbers are distinct modulo 3, showing that $\nu_3(f \odot h) = 3$, and the only possible case is $(n, n + 2, n + 4) = (3, 5, 7)$. (Examples such as $f = (X^2 + 7, (X + 2)^2 + 7)$ and $h = (3, 1)$ show that this special situation where imprimitive $k$-tuples lead to vanishing singular series for $f \odot h$ is indeed a coincidence).

(2) If we take $f_3 = (X^2 + 1)$, and renormalize in an obvious way, we find that the number of primes of the form $p = n^2 + 1$ in an interval of the form $N^2 < n \le (N + \lambda(\log N))^2$ should be approximately distributed like a Poisson random variable with mean

"w-t n (i-^hj.) n ('-^i 

p=l(mod4) v ^ ' p=3(mod4) 1 